Disentanglement of Speaker Identity for Accented Speech Recognition

Yifan Wang, Hongjie Gu, Ran Shen, Yiling Li, Weihao Jiang, Junjie Huang

2023 8th International Conference on Communication, Image and Signal Processing (CCISP) (2023)

Abstract
Automatic speech recognition (ASR) systems have achieved good performance on standard English speech. However, recognition accuracy degrades significantly for non-native speakers with accented speech. Existing multi-accent ASR models use accent embeddings as an extra input to handle diverse accents, but these embeddings are often entangled with other speech attributes, which impedes model performance. In this paper, we propose a speaker-identity-disentangled accent modeling approach that extracts an accent embedding free of speaker-specific information and uses it to improve multi-accent speech recognition. Experiments on the CommonVoice dataset show that the proposed method achieves a 15% reduction in word error rate (WER) over baselines.
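
For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below illustrates that standard definition and one way to compute a relative reduction. It is not the authors' evaluation code: the function names are hypothetical, the whitespace tokenization is a simplifying assumption, and the abstract does not state whether the 15% figure is relative or absolute.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()   # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline WER removed by the new system."""
    return (baseline_wer - new_wer) / baseline_wer
```

Under the relative reading, for example, a baseline WER of 0.20 improved to 0.17 would count as a 15% reduction. These numbers are illustrative only and are not taken from the paper.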
Keywords
accented speech recognition, multi-accent, speaker identity, disentanglement