Korean Grapheme Unit-based Speech Recognition Using Attention-CTC Ensemble Network

2019 International Symposium on Multimedia and Communication Technology (ISMAC), 2019

Abstract
This study proposes an end-to-end speech recognition method based on an Attention-CTC ensemble network that uses Korean graphemes as recognition units. End-to-end speech recognition replaces the multi-module pipeline of conventional systems, including the DNN-HMM-based acoustic model, the N-gram-based language model, and the WFST-based decoding network, with a single deep neural network. The end-to-end model in this study predicts grapheme-unit outputs. Building the network on graphemes enables effective learning by reducing the number of output units to be predicted from 11,172 (the full set of Hangul syllables) to 49. To this end, the study designs an end-to-end model that combines connectionist temporal classification (CTC), a network structure widely used in end-to-end learning, with an attention network model. In the experiment, the model achieved a syllable error rate of 10.5%.
Keywords
Attention network,Connectionist temporal classification,Speech recognition,Deep neural network