Domain Adaptation of Low-Resource Target-Domain Models Using Well-Trained ASR Conformer Models

2022 IEEE Spoken Language Technology Workshop (SLT), 2023

Abstract
In the encoder-decoder framework for Automatic Speech Recognition (ASR), the decoder of a well-trained ASR model is largely tuned towards the source domain, which hurts the performance of target-domain models under vanilla transfer learning. The encoder layers of the well-trained ASR model, on the other hand, mainly capture acoustic characteristics. In this paper, embeddings tapped from the encoder layers of a well-trained ASR model are used as features for domain adaptation of a downstream low-resource Conformer target-domain model. We present ablation studies on the optimal encoder layers for tapping embeddings and on the effect of freezing or updating the well-trained ASR model's encoder layers. Finally, applying Spectral Augmentation (SpecAug) to the proposed features further improves target-domain performance. The proposed method reports an average relative improvement of ~40% over the baseline across different source-domain model and target-domain Conformer model combinations.
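The core idea of the abstract — tapping an intermediate encoder layer of a frozen, well-trained model and feeding the resulting embeddings to a small target-domain model — can be sketched as follows. This is a minimal illustrative toy, not the paper's actual architecture: the layer sizes, the tap index, and the linear target-domain head are all hypothetical stand-ins.

```python
import torch
import torch.nn as nn

class SourceEncoder(nn.Module):
    """Toy stand-in for a well-trained source-domain ASR encoder.

    Dimensions (80 filterbank bins -> 144-dim hidden, 6 layers) are
    illustrative assumptions, not values from the paper.
    """
    def __init__(self, feat_dim=80, hidden=144, num_layers=6):
        super().__init__()
        self.proj = nn.Linear(feat_dim, hidden)
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True)
            for _ in range(num_layers)
        )

    def forward(self, x, tap_layer=4):
        # Run the stack only up to `tap_layer` and return that layer's
        # embeddings, mirroring the paper's "tapping" of encoder outputs.
        h = self.proj(x)
        for i, layer in enumerate(self.layers, start=1):
            h = layer(h)
            if i == tap_layer:
                break
        return h

source = SourceEncoder()
# One of the paper's ablation options: freeze the well-trained encoder.
for p in source.parameters():
    p.requires_grad = False

# Hypothetical target-domain head consuming the tapped embeddings;
# the paper uses a Conformer model here instead.
target_head = nn.Linear(144, 500)

feats = torch.randn(2, 100, 80)          # (batch, frames, filterbank bins)
with torch.no_grad():
    emb = source(feats, tap_layer=4)     # tapped encoder embeddings
logits = target_head(emb)
print(tuple(logits.shape))               # (2, 100, 500)
```

Which layer to tap is exactly what the paper's ablation studies explore: earlier layers are closer to raw acoustics, later layers are more source-domain specific.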
Keywords
Domain adaptation, Automatic Speech Recognition (ASR), Pre-trained model, Low-resource speech recognition, Feature extraction