Improving multilingual speech emotion recognition by combining acoustic features in a three-layer model

Speech Communication (2019)

Citations: 93 | Views: 13


Abstract

• We study multilingual speech emotion recognition (mSER) by combining acoustic features in a three-layer perceptual emotion model.
• We analyze three vital issues: 1) robust features for mSER; 2) the impact of speaker normalization (SN); 3) generalization of mSER to a new language.
• Prosody and modulation spectrum features are studied. Z-normalization serves as SN. Cross-speaker and cross-corpus tasks enhance the robustness of mSER.
• The proposed mSER model outperforms previous works. Notably, it achieves results comparable to monolingual SER in a new language without training.
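
The highlights state only that Z-normalization is used for speaker normalization; they do not give the details. As a minimal sketch, assuming the normalization is applied per speaker and per feature dimension with the population standard deviation, it could look like the following. The function name `speaker_z_normalize`, the data layout, and the handling of zero-variance features are hypothetical choices made for illustration, not the authors' implementation.

```python
from collections import defaultdict
from statistics import mean, pstdev


def speaker_z_normalize(features, speakers):
    """Z-normalize each feature dimension within each speaker (illustrative).

    features: list of equal-length lists of floats, one row per utterance
    speakers: list of speaker IDs aligned with the rows of `features`
    """
    # Group utterance indices by speaker.
    rows_by_speaker = defaultdict(list)
    for idx, spk in enumerate(speakers):
        rows_by_speaker[spk].append(idx)

    normalized = [None] * len(features)
    for rows in rows_by_speaker.values():
        # Column-wise statistics over this speaker's utterances only.
        columns = list(zip(*(features[i] for i in rows)))
        means = [mean(col) for col in columns]
        stds = [pstdev(col) for col in columns]
        for i in rows:
            # Assumption: a constant feature (zero std) maps to 0.0.
            normalized[i] = [
                (x - m) / s if s > 0 else 0.0
                for x, m, s in zip(features[i], means, stds)
            ]
    return normalized


# Toy data: column 0 mimics a pitch-like feature with different speaker ranges.
feats = [[100.0, 1.0], [120.0, 3.0], [200.0, 1.0], [240.0, 3.0]]
spks = ["a", "a", "b", "b"]
result = speaker_z_normalize(feats, spks)
```

In this toy example the two speakers have very different raw ranges in the first column, yet after normalization their low and high utterances land on the same scale, which is the effect speaker normalization is meant to have on speaker-dependent features.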


Keywords

Multilingual emotion recognition, Human emotional perception, Emotional space, Three-layer model
