Using Audio Transformations To Improve Comprehension In Voice Question Answering

Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2019), 2019

Abstract
Many popular form factors of digital assistants, such as Amazon Echo or Google Home, enable users to converse with speech-based systems. The lack of screens presents unique challenges. To satisfy users' information needs, the presentation of answers has to be optimized for voice-only interactions. We evaluate the usefulness of audio transformations (i.e., prosodic modifications) for voice-only question answering. We introduce a crowdsourcing setup evaluating the quality of our proposed modifications along multiple dimensions corresponding to the informativeness, naturalness, and ability of users to identify key parts of the answer. We offer a set of prosodic modifications that highlight potentially important parts of the answer using various acoustic cues. Our experiments show that different modifications lead to better comprehension at the expense of slightly degraded naturalness of the audio.
Keywords
Speech generation, Question answering, Crowdsourcing
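
To make the idea of highlighting a key part of a spoken answer with an acoustic cue more concrete, here is a minimal sketch that wraps one span of an answer in SSML markup. It is an illustration only, not the setup used in the paper: the function name `highlight_answer`, the four cue names (pause, rate, pitch, emphasis), and the 300 ms pause length are all assumptions chosen for the example. The SSML elements themselves (`speak`, `break`, `prosody`, `emphasis`) are standard, although how a given speech synthesizer renders them may vary.

```python
from xml.sax.saxutils import escape


def highlight_answer(answer: str, key_span: str, cue: str = "emphasis") -> str:
    """Return SSML in which the first occurrence of key_span carries a cue.

    Hypothetical helper for illustration; the cue names and parameter
    values are assumptions, not taken from the paper.
    """
    start = answer.find(key_span)
    if start < 0:
        # Key span not present: speak the answer without any modification.
        return f"<speak>{escape(answer)}</speak>"

    end = start + len(key_span)
    before = escape(answer[:start])
    key = escape(key_span)
    after = escape(answer[end:])

    if cue == "pause":
        # Short silences on both sides of the key span (300 ms is assumed).
        marked = f'<break time="300ms"/>{key}<break time="300ms"/>'
    elif cue == "rate":
        # Slow down the speaking rate over the key span.
        marked = f'<prosody rate="slow">{key}</prosody>'
    elif cue == "pitch":
        # Raise the pitch over the key span.
        marked = f'<prosody pitch="high">{key}</prosody>'
    elif cue == "emphasis":
        # Ask the synthesizer for strong emphasis on the key span.
        marked = f'<emphasis level="strong">{key}</emphasis>'
    else:
        raise ValueError(f"unknown cue: {cue!r}")

    return f"<speak>{before}{marked}{after}</speak>"


if __name__ == "__main__":
    print(highlight_answer("The capital of France is Paris.", "Paris", cue="pause"))
```

Each cue is kept separate so that variants of the same answer can be generated and compared one modification at a time, which is the kind of comparison a crowdsourced listening study would need.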