Multi-task Pre-training for Lhasa-Tibetan Speech Recognition

ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT IX (2023)

Abstract
Compared with mainstream languages such as Chinese and English, Tibetan speech corpora are limited. Pre-training can improve speech recognition performance for a low-resource language by exploiting multilingual corpora: a neural network is first trained on a multilingual dataset and then fine-tuned on the low-resource language. In this paper, a multi-task serial pre-training method is proposed to address the limited resources in Tibetan speech recognition. By designing the number and order of tasks in the pre-training process, better recognition performance can be achieved. Experiments on the Lhasa-Tibetan speech recognition task show that the proposed method is significantly superior to the baseline model, achieving a Tibetan word error rate of 4.12%, which is 9.34% lower than the baseline model and 1.06% lower than an existing pre-training model.
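Since the abstract describes the method only at a high level, the following is a minimal, hypothetical sketch of what serial multi-task pre-training followed by fine-tuning can look like. It assumes a PyTorch CTC acoustic model with a shared encoder and one output head per task; the task names (mandarin, english), the synthetic data, and all hyperparameters are illustrative assumptions, not the paper's actual configuration.

```python
# A minimal sketch of serial multi-task pre-training followed by
# fine-tuning, assuming a PyTorch-style CTC acoustic model. The
# architecture, task names, data shapes, and hyperparameters are
# illustrative assumptions, not the paper's actual setup.
import torch
import torch.nn as nn


class AcousticModel(nn.Module):
    """Shared encoder with one output head per task; the shared
    encoder is what carries knowledge across pre-training stages."""

    def __init__(self, feat_dim=80, hidden=256, vocab_sizes=None):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=2, batch_first=True)
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden, v) for task, v in (vocab_sizes or {}).items()}
        )

    def forward(self, feats, task):
        out, _ = self.encoder(feats)   # (batch, time, hidden)
        return self.heads[task](out)   # (batch, time, vocab) logits


def make_batches(vocab, n_batches=2):
    """Synthetic batches standing in for a real speech DataLoader."""
    batches = []
    for _ in range(n_batches):
        feats = torch.randn(4, 120, 80)              # fake log-Mel frames
        targets = torch.randint(1, vocab, (4, 20))   # index 0 reserved for blank
        feat_lens = torch.full((4,), 120, dtype=torch.long)
        target_lens = torch.full((4,), 20, dtype=torch.long)
        batches.append((feats, targets, feat_lens, target_lens))
    return batches


def train_task(model, task, batches, epochs, lr=1e-4):
    """One stage of the serial schedule: train on a single task's data."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    ctc = nn.CTCLoss(blank=0, zero_infinity=True)
    for _ in range(epochs):
        for feats, targets, feat_lens, target_lens in batches:
            # CTC expects (time, batch, vocab) log-probabilities.
            log_probs = model(feats, task).log_softmax(-1).transpose(0, 1)
            loss = ctc(log_probs, targets, feat_lens, target_lens)
            opt.zero_grad()
            loss.backward()
            opt.step()


vocab_sizes = {"mandarin": 4000, "english": 1000, "tibetan": 2000}
model = AcousticModel(vocab_sizes=vocab_sizes)
# Serial pre-training: visit the source tasks one after another; the
# paper's contribution lies in choosing how many tasks to use and in
# what order, which here is just a fixed illustrative list.
for task in ["mandarin", "english"]:
    train_task(model, task, make_batches(vocab_sizes[task]), epochs=1)
# Fine-tune on the low-resource target task.
train_task(model, "tibetan", make_batches(vocab_sizes["tibetan"]), epochs=1)
```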
Keywords
Lhasa-Tibetan speech recognition, Multi-task, Serial Pre-training