Improving multilingual speech emotion recognition by combining acoustic features in a three-layer model

Speech Communication (2019)

Abstract
• We study multilingual speech emotion recognition (mSER) by combining acoustic features in a three-layer perceptual emotion model.
• We analyze three vital issues: (1) features that are robust for mSER; (2) the impact of speaker normalization (SN); (3) the generalization of mSER to a new language.
• Prosody and modulation spectrum features are studied. Z-normalization is used for SN. Cross-speaker and cross-corpus tasks enhance the robustness of mSER.
• The proposed mSER model outperforms previous work. Notably, it achieves results comparable to monolingual SER in a new language without training on it.
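The abstract states only that z-normalization is used for speaker normalization; it does not give the procedure. The following is a minimal sketch of one common reading, assuming that each acoustic feature is standardized with the mean and standard deviation computed over that speaker's own utterances. The function name `speaker_z_normalize`, the `eps` guard, and the toy feature values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def speaker_z_normalize(features, speaker_ids, eps=1e-8):
    """Z-normalize each feature column separately for every speaker.

    features:    array of shape (n_utterances, n_features)
    speaker_ids: sequence of length n_utterances
    eps:         small constant (an assumption) that avoids division by zero
    """
    features = np.asarray(features, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    normalized = np.empty_like(features)
    for spk in np.unique(speaker_ids):
        mask = speaker_ids == spk
        mean = features[mask].mean(axis=0)   # per-speaker feature means
        std = features[mask].std(axis=0)     # per-speaker feature std devs
        normalized[mask] = (features[mask] - mean) / (std + eps)
    return normalized

# Toy data (hypothetical): two features for two speakers, two utterances each.
X = np.array([[100.0, 1.0],
              [120.0, 3.0],
              [200.0, 10.0],
              [260.0, 14.0]])
ids = ["a", "a", "b", "b"]
Z = speaker_z_normalize(X, ids)
```

After this step, each speaker's features have roughly zero mean and unit variance, so differences in a speaker's baseline level (for example, habitual pitch) no longer dominate the feature values.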
Keywords
Multilingual emotion recognition, Human emotional perception, Emotional space, Three-layer model