Max Margin Cosine Loss for Speaker Identification on Short Utterances

2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP), 2018

Speaker identification has made extraordinary progress owing to the advancement of deep neural networks. Discriminative speaker features are vital to speaker recognition. However, the traditional softmax loss usually lacks discriminative power. To address this problem, this paper explores a novel loss function, namely max margin cosine loss (MMCL). Specifically, we realize the function by L2-normalizing both the features and the weight vectors in the softmax loss, and by adding a cosine margin term that maximizes the decision margin in the angular space. In addition, a max margin constraint is incorporated into the proposed loss function as a regularization term. Experimental results demonstrate the effectiveness of the proposed max margin cosine loss and its superiority over previous losses. For example, on the 2 s condition, MMCL reduces the equal error rate by 10.63% relative to additive angular margin cosine loss (AMCL), while AMCL already achieves a 6.37% relative reduction over softmax loss.
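The normalization and margin steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name, the scale `s`, the margin value `m`, and the choice to subtract the margin directly from the target-class cosine are all assumptions made for illustration, and the max margin regularization term is left out because the abstract does not give its form.

```python
import numpy as np

def cosine_margin_softmax_loss(features, weights, labels, s=30.0, m=0.35):
    """Softmax loss on L2-normalized features and weights with a cosine margin.

    features: (N, D) embeddings, weights: (C, D) class vectors, labels: (N,) ints.
    s (scale) and m (margin) are hypothetical values, not taken from the paper.
    """
    # L2-normalize both sides so each logit is a cosine similarity.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = f @ w.T  # shape (N, C)

    # Subtract the margin from the target-class cosine only (assumed form).
    idx = np.arange(len(labels))
    logits = s * cos
    logits[idx, labels] -= s * m

    # Numerically stable log-softmax followed by mean cross-entropy.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[idx, labels].mean()
```

Because both features and weights are normalized, rescaling an embedding leaves the loss unchanged, and a positive margin makes the loss strictly larger than the margin-free version on the same inputs, which is what pushes the network toward a wider decision margin.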
Training,Task analysis,Additives,Convolution,Face recognition,Recurrent neural networks