Incremental Vocabularies in Machine Translation Through Aligned Embedding Projections.

Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA), 2022

Abstract
The vocabulary of a neural machine translation (NMT) model is often one of its most critical components since it defines the numerical inputs that the model will receive. Because of this, replacing or modifying a model's vocabulary usually involves re-training the model to adjust its weights to the new embeddings. In this work, we study the properties that pre-trained embeddings must have in order to use them to extend the vocabulary of pre-trained NMT models in a zero-shot fashion. Our work shows that extending vocabularies for pre-trained NMT models to perform zero-shot translation is possible, but this requires the use of aligned, high-quality embeddings adapted to the model's domain.
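The core idea of extending a vocabulary with aligned embeddings can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the external and model embedding spaces share the same dimensionality, and it aligns them with an orthogonal Procrustes projection estimated from tokens present in both vocabularies before appending the projected new rows.

```python
import numpy as np

def align_and_extend(model_emb, shared_idx, ext_shared, ext_new):
    """Extend a pre-trained model's embedding matrix with external vectors.

    model_emb:  (V, d) embedding matrix of the pre-trained NMT model
    shared_idx: indices into model_emb of tokens also in the external vocab
    ext_shared: (len(shared_idx), d) external vectors for those shared tokens
    ext_new:    (n, d) external vectors for the new, out-of-vocabulary tokens

    Returns a (V + n, d) matrix: original rows plus projected new rows.
    """
    # Orthogonal Procrustes: find the rotation W minimising
    # ||ext_shared @ W - model_emb[shared_idx]||_F over orthogonal W.
    u, _, vt = np.linalg.svd(ext_shared.T @ model_emb[shared_idx])
    w = u @ vt
    # Map the new tokens' vectors into the model's embedding space
    # and append them to the existing embedding matrix.
    projected = ext_new @ w
    return np.vstack([model_emb, projected])
```

Because the projection is estimated only from shared tokens, no gradient update of the model's weights is required, which is what makes the zero-shot extension possible; the abstract's caveat is that this only works when the external embeddings are high quality and adapted to the model's domain.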
Keywords
Neural machine translation, Zero-shot translation, Continual learning