Incremental Vocabularies in Machine Translation Through Aligned Embedding Projections.

Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA), 2022

Abstract
The vocabulary of a neural machine translation (NMT) model is often one of its most critical components since it defines the numerical inputs that the model will receive. Because of this, replacing or modifying a model's vocabulary usually involves re-training the model to adjust its weights to the new embeddings. In this work, we study the properties that pre-trained embeddings must have in order to use them to extend the vocabulary of pre-trained NMT models in a zero-shot fashion. Our work shows that extending vocabularies for pre-trained NMT models to perform zero-shot translation is possible, but this requires the use of aligned, high-quality embeddings adapted to the model's domain.
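The core idea of extending a vocabulary with aligned embeddings can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the external and model embedding spaces share the same dimensionality, and it aligns them with an orthogonal Procrustes projection estimated from tokens present in both vocabularies before appending the projected new rows.

```python
import numpy as np

def align_and_extend(model_emb, shared_idx, ext_shared, ext_new):
    """Extend a pre-trained model's embedding matrix with external vectors.

    model_emb:  (V, d) embedding matrix of the pre-trained NMT model
    shared_idx: indices into model_emb of tokens also in the external vocab
    ext_shared: (len(shared_idx), d) external vectors for those shared tokens
    ext_new:    (n, d) external vectors for the new, out-of-vocabulary tokens

    Returns a (V + n, d) matrix: original rows plus projected new rows.
    """
    # Orthogonal Procrustes: find the rotation W minimising
    # ||ext_shared @ W - model_emb[shared_idx]||_F over orthogonal W.
    u, _, vt = np.linalg.svd(ext_shared.T @ model_emb[shared_idx])
    w = u @ vt
    # Map the new tokens' vectors into the model's embedding space
    # and append them to the existing embedding matrix.
    projected = ext_new @ w
    return np.vstack([model_emb, projected])
```

Because the projection is estimated only from shared tokens, no gradient update of the model's weights is required, which is what makes the zero-shot extension possible; the abstract's caveat is that this only works when the external embeddings are high quality and adapted to the model's domain.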
Keywords
Neural machine translation, Zero-shot translation, Continual learning