Domain Adaptation of Low-Resource Target-Domain Models Using Well-Trained ASR Conformer Models

2022 IEEE Spoken Language Technology Workshop (SLT), 2023

Abstract
In the encoder-decoder framework for Automatic Speech Recognition (ASR), the decoder of a well-trained ASR model is largely tuned towards the source domain, which hurts the performance of target-domain models under vanilla transfer learning. The encoder layers of the well-trained ASR model, on the other hand, mainly capture acoustic characteristics. In this paper, embeddings tapped from the encoder layers of a well-trained ASR model are used as features for domain adaptation of a downstream low-resource Conformer target-domain model. We present ablation studies on the optimal encoder layers for tapping embeddings and on the effect of freezing or updating the well-trained ASR model's encoder layers. Finally, applying Spectral Augmentation (SpecAug) to the proposed features further improves target-domain performance. The proposed method reports an average relative improvement of ~40% over the baseline across different source-domain model and target-domain Conformer model combinations.
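The core idea of the abstract — tapping an intermediate encoder layer of a frozen, well-trained model and feeding the resulting embeddings to a small target-domain model — can be sketched as follows. This is a minimal illustrative toy, not the paper's actual architecture: the layer sizes, the tap index, and the linear target-domain head are all hypothetical stand-ins.

```python
import torch
import torch.nn as nn

class SourceEncoder(nn.Module):
    """Toy stand-in for a well-trained source-domain ASR encoder.

    Dimensions (80 filterbank bins -> 144-dim hidden, 6 layers) are
    illustrative assumptions, not values from the paper.
    """
    def __init__(self, feat_dim=80, hidden=144, num_layers=6):
        super().__init__()
        self.proj = nn.Linear(feat_dim, hidden)
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True)
            for _ in range(num_layers)
        )

    def forward(self, x, tap_layer=4):
        # Run the stack only up to `tap_layer` and return that layer's
        # embeddings, mirroring the paper's "tapping" of encoder outputs.
        h = self.proj(x)
        for i, layer in enumerate(self.layers, start=1):
            h = layer(h)
            if i == tap_layer:
                break
        return h

source = SourceEncoder()
# One of the paper's ablation options: freeze the well-trained encoder.
for p in source.parameters():
    p.requires_grad = False

# Hypothetical target-domain head consuming the tapped embeddings;
# the paper uses a Conformer model here instead.
target_head = nn.Linear(144, 500)

feats = torch.randn(2, 100, 80)          # (batch, frames, filterbank bins)
with torch.no_grad():
    emb = source(feats, tap_layer=4)     # tapped encoder embeddings
logits = target_head(emb)
print(tuple(logits.shape))               # (2, 100, 500)
```

Which layer to tap is exactly what the paper's ablation studies explore: earlier layers are closer to raw acoustics, later layers are more source-domain specific.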
Keywords
Domain adaptation, Automatic Speech Recognition (ASR), Pre-trained model, Low-resource speech recognition, Feature extraction