Perception of Paralinguistic Traits in Synthesized Voices

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Standard

Perception of Paralinguistic Traits in Synthesized Voices. / Baird, Alice Emily; Hasse Jørgensen, Stina; Parada-Cabaleiro, Emilia; Hantke, Simone; Cummins, Nicholas; Schuller, Bjorn.

Proceedings of the 12th International Audio Mostly Conference : Augmented and Participatory Sound and Music Experiences : AM '17. New York : Association for Computing Machinery, 2017. 17.

Harvard

Baird, AE, Hasse Jørgensen, S, Parada-Cabaleiro, E, Hantke, S, Cummins, N & Schuller, B 2017, Perception of Paralinguistic Traits in Synthesized Voices. in Proceedings of the 12th International Audio Mostly Conference : Augmented and Participatory Sound and Music Experiences : AM '17, 17, Association for Computing Machinery, New York, Audio Mostly, London, United Kingdom, 23/08/2017. https://doi.org/10.1145/3123514.3123528

APA

Baird, A. E., Hasse Jørgensen, S., Parada-Cabaleiro, E., Hantke, S., Cummins, N., & Schuller, B. (2017). Perception of Paralinguistic Traits in Synthesized Voices. In Proceedings of the 12th International Audio Mostly Conference : Augmented and Participatory Sound and Music Experiences : AM '17 [17] Association for Computing Machinery. https://doi.org/10.1145/3123514.3123528

Vancouver

Baird AE, Hasse Jørgensen S, Parada-Cabaleiro E, Hantke S, Cummins N, Schuller B. Perception of Paralinguistic Traits in Synthesized Voices. In Proceedings of the 12th International Audio Mostly Conference : Augmented and Participatory Sound and Music Experiences : AM '17. New York: Association for Computing Machinery. 2017. 17 https://doi.org/10.1145/3123514.3123528

Author

Baird, Alice Emily ; Hasse Jørgensen, Stina ; Parada-Cabaleiro, Emilia ; Hantke, Simone ; Cummins, Nicholas ; Schuller, Bjorn. / Perception of Paralinguistic Traits in Synthesized Voices. Proceedings of the 12th International Audio Mostly Conference : Augmented and Participatory Sound and Music Experiences : AM '17. New York : Association for Computing Machinery, 2017.

Bibtex

@inproceedings{1ea160e3864b4430a58ec0c70a425273,

title = "Perception of Paralinguistic Traits in Synthesized Voices",

abstract = "Along with the rise of artificial intelligence and the internet-of-things, synthesized voices are now common in daily life, providing us with guidance, assistance, and even companionship. From formant to concatenative synthesis, the synthesized voice continues to be defined by the same traits we prescribe to ourselves. When the recorded voice is synthesized, does our perception of its new machine embodiment change, and can we consider an alternative, more inclusive form? To begin evaluating the impact of aesthetic design, this study presents a first-step perception test to explore the paralinguistic traits of the synthesized voice. Using a corpus of 13 synthesized voices, constructed from acoustic concatenative speech synthesis, we assessed the response of 23 listeners from differing cultural backgrounds. Evaluating if the perception shifts from the known ground-truths, we asked listeners to assign traits of age, gender, accent origin, and human-likeness. Results present a difference in perception for age and human-likeness across voices, and a general agreement across listeners for both gender and accent origin. Connections found between age, gender and human-likeness call for further exploration into a more participatory and inclusive synthesized vocal identity.",

keywords = "Faculty of Humanities, user studies, human-centered computing",

author = "Baird, {Alice Emily} and {Hasse J{\o}rgensen}, Stina and Emilia Parada-Cabaleiro and Simone Hantke and Nicholas Cummins and Bjorn Schuller",

year = "2017",

doi = "10.1145/3123514.3123528",

language = "English",

booktitle = "Proceedings of the 12th International Audio Mostly Conference",

publisher = "Association for Computing Machinery",

note = "Audio Mostly : Augmented and Participatory Sound/Music Experiences ; Conference date: 23-08-2017 Through 26-08-2017",

url = "http://audiomostly.com/",

}

RIS

TY - CONF

T1 - Perception of Paralinguistic Traits in Synthesized Voices

AU - Baird, Alice Emily

AU - Hasse Jørgensen, Stina

AU - Parada-Cabaleiro, Emilia

AU - Hantke, Simone

AU - Cummins, Nicholas

AU - Schuller, Bjorn

PY - 2017

Y1 - 2017

N2 - Along with the rise of artificial intelligence and the internet-of-things, synthesized voices are now common in daily life, providing us with guidance, assistance, and even companionship. From formant to concatenative synthesis, the synthesized voice continues to be defined by the same traits we prescribe to ourselves. When the recorded voice is synthesized, does our perception of its new machine embodiment change, and can we consider an alternative, more inclusive form? To begin evaluating the impact of aesthetic design, this study presents a first-step perception test to explore the paralinguistic traits of the synthesized voice. Using a corpus of 13 synthesized voices, constructed from acoustic concatenative speech synthesis, we assessed the response of 23 listeners from differing cultural backgrounds. Evaluating if the perception shifts from the known ground-truths, we asked listeners to assign traits of age, gender, accent origin, and human-likeness. Results present a difference in perception for age and human-likeness across voices, and a general agreement across listeners for both gender and accent origin. Connections found between age, gender and human-likeness call for further exploration into a more participatory and inclusive synthesized vocal identity.

AB - Along with the rise of artificial intelligence and the internet-of-things, synthesized voices are now common in daily life, providing us with guidance, assistance, and even companionship. From formant to concatenative synthesis, the synthesized voice continues to be defined by the same traits we prescribe to ourselves. When the recorded voice is synthesized, does our perception of its new machine embodiment change, and can we consider an alternative, more inclusive form? To begin evaluating the impact of aesthetic design, this study presents a first-step perception test to explore the paralinguistic traits of the synthesized voice. Using a corpus of 13 synthesized voices, constructed from acoustic concatenative speech synthesis, we assessed the response of 23 listeners from differing cultural backgrounds. Evaluating if the perception shifts from the known ground-truths, we asked listeners to assign traits of age, gender, accent origin, and human-likeness. Results present a difference in perception for age and human-likeness across voices, and a general agreement across listeners for both gender and accent origin. Connections found between age, gender and human-likeness call for further exploration into a more participatory and inclusive synthesized vocal identity.

KW - Faculty of Humanities

KW - user studies

KW - human-centered computing

U2 - 10.1145/3123514.3123528

DO - 10.1145/3123514.3123528

M3 - Article in proceedings

BT - Proceedings of the 12th International Audio Mostly Conference

PB - Association for Computing Machinery

CY - New York

T2 - Audio Mostly

Y2 - 23 August 2017 through 26 August 2017

ER -

ID: 195758688
