Few Shot Learning Guided by Emotion Distance for Cross-corpus Speech Emotion Recognition

2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2023)

Abstract
Cross-corpus speech emotion recognition (SER) is important for building robust, practical SER systems that can adapt to varied real-world scenarios and acoustic conditions. Its accuracy is low partly because labeled data in the target corpus are scarce and inconsistent; in this situation, prior knowledge about the relationships between emotion categories may help. Previous studies suggest that discrete emotion categories and the continuous emotion space complement each other in describing emotions. In this paper, we propose using the distance between emotion categories, derived from their distributions in the continuous emotion space, as prior knowledge for cross-corpus SER. We hypothesize that this prior knowledge helps the SER system learn more meaningful and generalizable emotion representations that are consistent across domains. Experimental results show that the proposed few-shot learning method, which is based on metric learning and leverages the emotion-distance prior, achieves good performance.
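The abstract does not say how the distance between emotion categories is computed. As a minimal sketch of one possible reading, the snippet below assumes each category is summarized by the centroid of its annotations in a two-dimensional valence-arousal space and that the distance is Euclidean. The names `toy_annotations`, `centroid`, and `emotion_distances` are hypothetical, and the coordinates are illustrative placeholders rather than values from the paper.

```python
import math

# Hypothetical toy annotations: (valence, arousal) pairs per category.
# Illustrative placeholders only; these are NOT values from the paper.
toy_annotations = {
    "happy": [(0.8, 0.6), (0.6, 0.4)],
    "angry": [(-0.6, 0.8), (-0.4, 0.6)],
    "sad": [(-0.7, -0.5), (-0.5, -0.3)],
}


def centroid(points):
    """Return the mean (valence, arousal) of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def emotion_distances(annotations):
    """Map each ordered pair of categories to the Euclidean distance
    between their centroids (an assumed choice of distance)."""
    centers = {label: centroid(pts) for label, pts in annotations.items()}
    return {
        (a, b): math.dist(centers[a], centers[b])
        for a in centers
        for b in centers
    }


distances = emotion_distances(toy_annotations)
```

Under these assumptions, a metric-learning objective could use such a table as a prior, for example by encouraging learned class prototypes to lie farther apart when the corresponding categories are farther apart in the continuous space. How the paper actually injects this prior is not stated in the abstract.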