LARGE MARGIN TRAINING IMPROVES LANGUAGE MODELS FOR ASR

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021)

Abstract
Language models (LMs) have been widely deployed in modern ASR systems. An LM is typically trained by minimizing its perplexity on speech transcripts. However, few studies try to discriminate a "gold" reference against inferior hypotheses. In this work, we propose the large margin language model (LMLM), a general framework that enforces an LM to assign a higher score to the "gold" reference and a lower one to inferior hypotheses. The framework is applied to three pretrained LM architectures: a left-to-right LSTM, a transformer encoder, and a transformer decoder. Results show that LMLM significantly outperforms traditional LMs trained by minimizing perplexity, especially in challenging noisy conditions. Among the three architectures, the transformer encoder achieves the best performance.
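The margin objective described in the abstract can be illustrated with a minimal PyTorch sketch. This is a hedged illustration under assumptions, not the paper's exact formulation: the function name large_margin_loss, the fixed margin of 1.0, and the use of sentence-level LM log-likelihoods as scores are choices made here for concreteness.

import torch
import torch.nn.functional as F

def large_margin_loss(gold_scores: torch.Tensor,
                      hyp_scores: torch.Tensor,
                      margin: float = 1.0) -> torch.Tensor:
    """Pairwise hinge loss: penalize every case where an inferior
    hypothesis scores within `margin` of (or above) the gold reference.

    gold_scores: (batch,) LM scores of the reference transcripts
                 (e.g., sentence-level log-likelihoods; an assumption)
    hyp_scores:  (batch,) LM scores of competing ASR hypotheses
    """
    # Loss is zero only when gold outscores the hypothesis by >= margin.
    return F.relu(margin - (gold_scores - hyp_scores)).mean()

# Usage sketch with hypothetical log-likelihood scores:
gold = torch.tensor([-12.3, -8.1])
hyp = torch.tensor([-11.9, -10.4])
loss = large_margin_loss(gold, hyp)  # positive only for the first pair

Any of the three architectures mentioned above (LSTM, transformer encoder, transformer decoder) could supply the scores; the loss itself is architecture-agnostic, which is what makes the framework general.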
Keywords
Large Margin, Transformer, ASR, WER