Rakuten's Participation in WAT 2022: Parallel Dataset Filtering by Leveraging Vocabulary Heterogeneity.
International Conference on Computational Linguistics (2022)
Abstract
This paper describes our neural machine translation system submitted to the WAT 2022 shared translation task (team ID: sakura). We participated in the Parallel Data Filtering Task. Our approach, based on Feature Decay Algorithms (FDA), improved BLEU by +1.4 points for English-to-Japanese and +2.4 points for Japanese-to-English over a model trained on the full dataset, which shows the effectiveness of FDA for in-domain data selection.
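To illustrate the general idea behind Feature Decay Algorithms, the following is a minimal sketch of FDA-style greedy selection, not the system described in the paper. It assumes whitespace tokenization, n-gram features up to bigrams, a multiplicative decay, and length normalization; the abstract specifies none of these choices, and the names `ngrams` and `fda_select` are hypothetical.

```python
def ngrams(tokens, max_n=2):
    """Return the set of n-grams (as tuples) up to length max_n."""
    return {tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)}


def fda_select(pool, in_domain, k, decay=0.5, max_n=2):
    """Greedily pick up to k pool sentences that cover in-domain n-grams.

    Assumption: each time a selected sentence covers a feature, that
    feature's weight is multiplied by `decay`, so later picks favor
    features that have not been covered yet.
    Returns the indices of the selected pool sentences, in pick order.
    """
    # Features are the n-grams observed in the in-domain sample.
    features = set()
    for sent in in_domain:
        features |= ngrams(sent.split(), max_n)
    weights = {f: 1.0 for f in features}

    # For each candidate, keep only the n-grams that are also features.
    candidates = {i: ngrams(s.split(), max_n) & features
                  for i, s in enumerate(pool)}

    def score(i):
        # Sum of current feature weights, normalized by sentence length.
        length = max(len(pool[i].split()), 1)
        return sum(weights[f] for f in candidates[i]) / length

    selected = []
    while candidates and len(selected) < k:
        # Highest score wins; ties go to the lower index.
        best = max(candidates, key=lambda i: (score(i), -i))
        selected.append(best)
        for f in candidates.pop(best):
            weights[f] *= decay  # decay the features just covered
    return selected
```

In a parallel-data setting, a sketch like this would score one side of each sentence pair against a sample of in-domain text and keep the top-ranked pairs for training.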