Age Regression for Human Voices

Martin T. Schorradt, Douglas Cunningham

Multimodal Interfaces and Machine Learning for Multimodal Interaction (2022)

Abstract
The human voice is one of our most important tools for communicating with other people. Besides pure semantic meaning, it also conveys syntactical information such as emphasis, as well as personal information such as emotional state, gender, and age. While the physical changes that occur to a person’s voice are well studied, there is surprisingly little work on the perception of those changes. To hold the range of subtleties present in a given utterance constant and thus focus on the changes caused by age, this paper takes adult recordings (three males and three females) and artificially resynthesizes them (using values from measurements of real children’s voices) to create childlike versions of the utterance at different target ages. In particular, we focus on a systematic, factorial combination of pitch shifting and formant shifting. To gain insight into the influence of these factors on the estimated age, we performed a perceptual experiment. Since the resynthesis method we used can produce a wide range of voices, not all of which are physically consistent, we also asked the participants to rate how natural the voices sounded. Furthermore, since former studies suggest that people are not able to distinguish between males and females of young ages, participants were also asked to rate how male or female the voices sounded. Overall, we found that although the synthesis method produced physically plausible signals (compared to average values for real children), the degree of signal manipulation was correlated with perceived unnaturalness. We also found that pitch shift had only a small effect on perceived age, that formant shift had a strong effect on perceived age, and that these effects depended on the original gender of the recording. As expected, people had difficulty guessing the gender of younger-sounding voices.
Keywords
Age Synthesis, Age Regression, Speech Synthesis, Speech Processing