Multi-task Pre-training for Lhasa-Tibetan Speech Recognition

ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT IX (2023)

Abstract
Compared with mainstream languages such as Chinese and English, Tibetan speech corpora are limited. Pre-training can improve speech recognition performance for a low-resource language by exploiting multilingual corpora: a neural network is first trained on a multilingual dataset and then fine-tuned on the low-resource language. In this paper, a multi-task serial pre-training method is proposed to address the limited resources in Tibetan speech recognition. By designing the number and order of tasks in the pre-training process, better recognition performance can be achieved. Experiments on the Lhasa-Tibetan speech recognition task show that the proposed method is significantly superior to the baseline model, achieving a Tibetan word error rate of 4.12%, which is 9.34% lower than the baseline model and 1.06% lower than an existing pre-training model.
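Since the abstract describes the method only at a high level, the following is a minimal, hypothetical sketch of what serial multi-task pre-training followed by fine-tuning can look like. It assumes a PyTorch CTC acoustic model with a shared encoder and one output head per task; the task names (mandarin, english), the synthetic data, and all hyperparameters are illustrative assumptions, not the paper's actual configuration.

```python
# A minimal sketch of serial multi-task pre-training followed by
# fine-tuning, assuming a PyTorch-style CTC acoustic model. The
# architecture, task names, data shapes, and hyperparameters are
# illustrative assumptions, not the paper's actual setup.
import torch
import torch.nn as nn


class AcousticModel(nn.Module):
    """Shared encoder with one output head per task; the shared
    encoder is what carries knowledge across pre-training stages."""

    def __init__(self, feat_dim=80, hidden=256, vocab_sizes=None):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=2, batch_first=True)
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden, v) for task, v in (vocab_sizes or {}).items()}
        )

    def forward(self, feats, task):
        out, _ = self.encoder(feats)   # (batch, time, hidden)
        return self.heads[task](out)   # (batch, time, vocab) logits


def make_batches(vocab, n_batches=2):
    """Synthetic batches standing in for a real speech DataLoader."""
    batches = []
    for _ in range(n_batches):
        feats = torch.randn(4, 120, 80)              # fake log-Mel frames
        targets = torch.randint(1, vocab, (4, 20))   # index 0 reserved for blank
        feat_lens = torch.full((4,), 120, dtype=torch.long)
        target_lens = torch.full((4,), 20, dtype=torch.long)
        batches.append((feats, targets, feat_lens, target_lens))
    return batches


def train_task(model, task, batches, epochs, lr=1e-4):
    """One stage of the serial schedule: train on a single task's data."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    ctc = nn.CTCLoss(blank=0, zero_infinity=True)
    for _ in range(epochs):
        for feats, targets, feat_lens, target_lens in batches:
            # CTC expects (time, batch, vocab) log-probabilities.
            log_probs = model(feats, task).log_softmax(-1).transpose(0, 1)
            loss = ctc(log_probs, targets, feat_lens, target_lens)
            opt.zero_grad()
            loss.backward()
            opt.step()


vocab_sizes = {"mandarin": 4000, "english": 1000, "tibetan": 2000}
model = AcousticModel(vocab_sizes=vocab_sizes)
# Serial pre-training: visit the source tasks one after another; the
# paper's contribution lies in choosing how many tasks to use and in
# what order, which here is just a fixed illustrative list.
for task in ["mandarin", "english"]:
    train_task(model, task, make_batches(vocab_sizes[task]), epochs=1)
# Fine-tune on the low-resource target task.
train_task(model, "tibetan", make_batches(vocab_sizes["tibetan"]), epochs=1)
```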
Keywords
Lhasa-Tibetan speech recognition, Multi-task, Serial Pre-training