Online Speaker Adaptation for LVCSR Based on Attention Mechanism

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) (2018)

Abstract
Speaker adaptation is one of the most popular and important topics in speech recognition. In this paper, we propose a novel online speaker adaptation technique for deep neural network based large vocabulary continuous speech recognition (LVCSR). In this approach, the i-vectors of the speakers in the training set are extracted and stored as a static memory. For each frame, an attention mechanism selects from the memory the speaker i-vectors most relevant to the current speech segment. We also propose a new attention mechanism to improve performance. The vectors obtained by the attention mechanism provide speaker information that improves speech recognition accuracy. Experiments on the Switchboard task show that the proposed approach achieves a relative word error rate (WER) reduction of 8.3% over the speaker-independent model without any adaptation data. The result is comparable to that of the popular i-vector based offline speaker adaptation method and much better than that of the i-vector based online speaker adaptation method.
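The memory lookup described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the scoring function, the dimensions, or how a frame is mapped to a query, so everything below is an assumption for illustration: plain dot-product attention with a softmax, length-normalized toy i-vectors, and a query that is already in i-vector space (in practice it would presumably come from a learned projection of the acoustic frame). The paper's own "new attention mechanism" is not reproduced here.

```python
import numpy as np

def attend(query, memory):
    """Soft-select i-vectors from a static memory for one frame.

    query:  (dim,) vector in the same space as the i-vectors (assumed).
    memory: (num_speakers, dim) matrix of training-speaker i-vectors.
    Returns the attention weights and the weighted sum of memory rows.
    """
    scores = memory @ query              # dot-product relevance per speaker
    scores = scores - scores.max()       # stabilize the softmax numerically
    weights = np.exp(scores)
    weights = weights / weights.sum()    # weights are non-negative and sum to 1
    context = weights @ memory           # speaker vector fed to the acoustic model
    return weights, context

# Toy static memory: 5 hypothetical training speakers, 4-dim i-vectors.
rng = np.random.default_rng(0)
memory = rng.standard_normal((5, 4))
memory /= np.linalg.norm(memory, axis=1, keepdims=True)  # length-normalize rows

# A query identical to speaker 2's i-vector should attend most to speaker 2.
query = memory[2]
weights, context = attend(query, memory)
```

Because the memory is fixed after training, this lookup needs no adaptation data from the test speaker; the resulting `context` vector plays the role that an explicitly estimated i-vector would play in offline adaptation.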
Keywords
Training,Adaptation models,Hidden Markov models,Acoustics,Task analysis,Data models,Feature extraction