Speech Emotion Recognition using MFCC, GFCC, Chromagram and RMSE features

International Conference on Signal Processing (2021)

Abstract
In recent years, research on the emotions present in speech has received increasing attention, and various systems have been developed to detect emotions in a speaker's statements. One of the biggest differences between a machine and a human is the ability to understand the emotions of others and behave accordingly. Researchers are working to bridge this gap by recognizing emotions in speech or voice. This paper proposes a deep learning-based technique for speech emotion recognition (SER). The SER system combines several techniques that use distinct modules for emotion recognition. The model differentiates emotions such as neutral state, happiness, sadness, anger, and surprise, among others. The performance of the classification system depends on the features extracted and the models generated. The features used include energy, pitch, chromagram, mel-frequency cepstral coefficients (MFCC), and Gammatone frequency cepstral coefficients (GFCC). The emotions are classified using a two-dimensional Convolutional Neural Network (CNN). The classification model achieved an overall accuracy of 92.59% on the test data, which is better than the previous algorithm. In future work, the intention is to improve system performance and detect more emotions.
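
The abstract lists frame-level energy (the RMSE of the title) among the input features but does not give the framing parameters. As a minimal sketch of how such a feature could be computed, the following uses only the Python standard library. The function name `frame_rms`, the default `frame_length` and `hop_length`, and the synthetic test tone are all assumptions made for illustration and are not taken from the paper.

```python
import math


def frame_rms(signal, frame_length=2048, hop_length=512):
    """Return the root-mean-square energy of each frame of a mono signal.

    frame_length and hop_length are illustrative defaults (assumptions),
    not values reported in the paper.
    """
    if frame_length <= 0 or hop_length <= 0:
        raise ValueError("frame_length and hop_length must be positive")
    energies = []
    # Slide a fixed-size window over the signal; a trailing partial
    # frame is dropped rather than zero-padded.
    for start in range(0, len(signal) - frame_length + 1, hop_length):
        frame = signal[start:start + frame_length]
        mean_square = sum(x * x for x in frame) / frame_length
        energies.append(math.sqrt(mean_square))
    return energies


# Hypothetical input: one second of a 440 Hz sine at 16 kHz, amplitude 0.5.
sample_rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate)]
rms = frame_rms(tone)
```

In a pipeline like the one described, a per-frame energy sequence of this kind would typically be stacked with the other frame-level features (MFCC, GFCC, chromagram) to form the two-dimensional input to the CNN. How the paper actually combines them is not stated in the abstract.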
Keywords
Emotion recognition, Feature extraction, Cepstrum, Gammatone filters, Glottal waveform, CNN