Attention based gender and nationality information exploration for speaker identification

Digital Signal Processing (2022)

Abstract
Gender and nationality information has not been exploited in large-scale speaker recognition despite being provided in the popular VoxCeleb1 dataset. This paper explores methods that combine high-level features extracted from the gender and nationality information with low-level acoustic features for speaker identification. To our knowledge, this is the first time that the gender and nationality information provided in VoxCeleb1 has been utilized in speaker identification. Specifically, we propose a Gender-Guided Spectrogram-Attention network and a Nationality-Guided Spectrogram-Attention network that embed gender and nationality information into the spectrogram features, respectively. The resulting gender and nationality embeddings are then used together with the spectrogram features for classification. Experimental results show that the proposed methods can successfully capture the gender and nationality information of the speakers and can effectively improve speaker identification accuracy. © 2022 Elsevier Inc. All rights reserved.
Keywords
Speaker identification, Gender, Nationality, High-level features, Attention network