Disentanglement of Speaker Identity for Accented Speech Recognition

Yifan Wang, Hongjie Gu, Ran Shen, Yiling Li, Weihao Jiang, Junjie Huang

2023 8th International Conference on Communication, Image and Signal Processing (CCISP) (2023)

Abstract

Automatic speech recognition (ASR) systems have achieved good performance on standard English speech. However, recognition accuracy degrades significantly for non-native speakers with accented speech. Existing multi-accent ASR models use accent embeddings as an extra input to handle diverse accents, but these embeddings are often entangled with other speech attributes, which impedes model performance. In this paper, we propose a speaker-identity-disentangled accent modeling approach, which extracts an accent embedding free of speaker-specific information and uses it to improve the performance of multi-accent speech recognition systems. Experiments conducted on the CommonVoice dataset demonstrate that our proposed method attains a 15% reduction in word error rate (WER) over baselines.

Keywords

accented speech recognition, multi-accent, speaker identity, disentanglement
