Emotion Recognition Using Imperfect Speech Recognition

Florian Metze,Anton Batliner,Florian Eyben,Tim Polzehl,Bjoern Schuller,Stefan Steidl

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2（2010）

引用 38|浏览20

暂无评分

摘要

This paper investigates the use of speech-to-text methods for assigning an emotion class to a given speech utterance. Previous work shows that an emotion extracted from text can convey complementary evidence to the information extracted by classifiers based on spectral, or other non-linguistic features. As speech-to-text usually presents significantly more computational effort, in this study we investigate the degree of speech-to-text accuracy needed for reliable detection of emotions from an automatically generated transcription of an utterance. We evaluate the use of hypotheses in both training and testing, and compare several classification approaches on the same task. Our results show that emotion recognition performance stays roughly constant as long as word accuracy doesn't fall below a reasonable value, making the use of speech-to-text viable for training of emotion classifiers based on linguistics.

查看译文

关键词

speech-to-text,emotion detection,meta-data extraction,rich transcription,children's speech

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要