Automatic Recognition Of Self-Reported And Perceived Emotion: Does Joint Modeling Help?

ICMI-MLMI (2016)

Abstract
Emotion labeling is a central component of automatic emotion recognition. Evaluators are asked to estimate the emotion label given a set of cues, produced either by themselves (self-report label) or others (perceived label). This process is complicated by the mismatch between the intentions of the producer and the interpretation of the perceiver. Traditionally, emotion recognition systems use only one of these types of labels when estimating the emotion content of data. In this paper, we explore the impact of jointly modeling both an individual's self-report and the perceived label of others. We use deep belief networks (DBN) to learn a representative feature space, and model the potentially complementary relationship between intention and perception using multi-task learning. We hypothesize that the use of DBN feature learning and multi-task learning of self-report and perceived emotion labels will improve the performance of emotion recognition systems. We test this hypothesis on the IEMOCAP dataset, an audio-visual and motion-capture emotion corpus. We show that both DBN feature learning and multi-task learning offer complementary gains. The results demonstrate that the perceived emotion tasks see the greatest performance gain for emotionally subtle utterances, while the self-report emotion tasks see the greatest performance gain for emotionally clear utterances. Our results suggest that combining knowledge from the self-report and perceived emotion labels leads to more effective emotion recognition systems.
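The core modeling idea in the abstract is multi-task learning over a shared representation: one network body feeds two task-specific heads, one predicting the self-report label and one predicting the perceived label, so gradients from both tasks shape the shared features. The following is a minimal numpy sketch of that structure only; it is not the paper's actual DBN pipeline, and all dimensions, data, and hyperparameters here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 100 utterances, 20-dim features (a stand-in for DBN-learned
# features), 4 emotion classes for each label type. All values are synthetic.
n, d, h, k = 100, 20, 16, 4
X = rng.normal(size=(n, d))
y_self = rng.integers(0, k, size=n)   # self-report labels (task A)
y_perc = rng.integers(0, k, size=n)   # perceived labels (task B)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Shared hidden layer plus two task-specific output heads.
W_sh = rng.normal(scale=0.1, size=(d, h))
W_a = rng.normal(scale=0.1, size=(h, k))   # head for self-report task
W_b = rng.normal(scale=0.1, size=(h, k))   # head for perceived task

lr, losses = 0.1, []
for _ in range(200):
    H = np.tanh(X @ W_sh)                  # shared representation
    Pa, Pb = softmax(H @ W_a), softmax(H @ W_b)
    # Joint loss: sum of the two cross-entropies.
    loss = -(np.log(Pa[np.arange(n), y_self]).mean()
             + np.log(Pb[np.arange(n), y_perc]).mean())
    losses.append(loss)
    Ga, Gb = Pa.copy(), Pb.copy()
    Ga[np.arange(n), y_self] -= 1          # softmax cross-entropy gradient
    Gb[np.arange(n), y_perc] -= 1
    # Gradients from BOTH tasks flow back into the shared layer.
    dH = (Ga @ W_a.T + Gb @ W_b.T) * (1 - H ** 2)
    W_a -= lr * H.T @ Ga / n
    W_b -= lr * H.T @ Gb / n
    W_sh -= lr * X.T @ dH / n

acc_self = (softmax(np.tanh(X @ W_sh) @ W_a).argmax(1) == y_self).mean()
acc_perc = (softmax(np.tanh(X @ W_sh) @ W_b).argmax(1) == y_perc).mean()
```

The single update to `W_sh` driven by both tasks is what lets the self-report and perceived labels act as complementary supervision, which is the relationship the paper investigates.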
Keywords
Audio-visual emotion recognition, self-reported emotion, perceived emotion, multi-task learning