Leveraging Simultaneous Translation for Enhancing Transcription of Low-resource Language via Cross Attention Mechanism

Conference of the International Speech Communication Association (INTERSPEECH) (2022)

Abstract
This work addresses automatic speech recognition (ASR) of a low-resource language using a translation corpus that includes simultaneous translations of speech in that language. In multi-lingual events such as international meetings and court proceedings, simultaneous interpretation by a human interpreter is often available for speech in low-resource languages. In this setting, we can assume that the content of its back-translation is the same as the transcription of the original speech. Thus, the former is expected to enhance the latter. We formulate this framework as a joint process of ASR and machine translation (MT) and implement it with a combination of cross attention mechanisms over the ASR encoder and the MT encoder. We evaluate the proposed method on the spoken language translation corpus of the Extraordinary Chambers in the Courts of Cambodia (ECCC), achieving a significant 8.9% relative reduction in the ASR word error rate (WER) for Khmer. The effectiveness is also confirmed on the Fisher-CallHome-Spanish corpus, with a 1.7% relative reduction in WER for Spanish.
Keywords
automatic speech recognition, machine translation, multi-lingual corpus, low-resource language, Khmer
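
The abstract describes a decoder that combines cross attention over the ASR encoder with cross attention over the MT encoder. The following is a minimal sketch of that idea for a single decoder step, not the authors' implementation. It uses plain scaled dot-product attention without learned projections, and it combines the two context vectors by summation. Both choices are assumptions, since the abstract does not say how the two attention outputs are merged. The names `attend` and `dual_cross_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Scaled dot-product attention of one query vector over encoder states.

    Returns the context vector and the attention weights.
    No learned query/key/value projections are applied (simplifying assumption).
    """
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, state)) / math.sqrt(dim)
        for state in memory
    ]
    weights = softmax(scores)
    context = [
        sum(w * state[i] for w, state in zip(weights, memory))
        for i in range(dim)
    ]
    return context, weights


def dual_cross_attention(query, asr_memory, mt_memory):
    """Attend to ASR and MT encoder states separately, then sum the contexts.

    Summation is an assumed combination rule chosen for illustration only.
    """
    asr_context, _ = attend(query, asr_memory)
    mt_context, _ = attend(query, mt_memory)
    return [a + b for a, b in zip(asr_context, mt_context)]


if __name__ == "__main__":
    # Toy values, purely illustrative.
    decoder_state = [1.0, 0.0]
    asr_states = [[1.0, 0.0], [0.0, 1.0]]   # e.g. acoustic frames
    mt_states = [[0.5, 0.5]]                # e.g. translated-text tokens
    print(dual_cross_attention(decoder_state, asr_states, mt_states))
```

In this sketch the decoder state queries each encoder independently, so the translated text can supply evidence about a word even when the acoustic evidence is weak. A trained model would add learned projections, multiple heads, and possibly a gated or concatenated combination instead of a plain sum.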