Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition.

IEEE ACM Trans. Audio Speech Lang. Process.(2023)

引用 0|浏览21
暂无评分
摘要
This research presents a music theory-inspired acoustic representation (hereafter, MTAR) to address improved speech emotion recognition. The recognition of emotion in speech and music is developed in parallel, yet a relatively limited understanding of MTAR for interpreting speech emotions is involved. In the present study, we use music theory to study representative acoustics associated with emotion in speech from vocal emotion expressions and auditory emotion perception domains. In experiments assessing the role and effectiveness of the proposed representation in classifying discrete emotion categories and predicting continuous emotion dimensions, it shows promising performance compared with extensively used features for emotion recognition based on the spectrogram, Mel-spectrogram, Mel-frequency cepstral coefficients, VGGish, and the large baseline feature sets of the INTERSPEECH challenges. This proposal opens up a novel research avenue in developing a computational acoustic representation of speech emotion via music theory.
更多
查看译文
关键词
acoustic representation,music,recognition,theory-inspired
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要