Datastore Distillation for Nearest Neighbor Machine Translation

IEEE/ACM Transactions on Audio, Speech, and Language Processing (2024)

Abstract
Nearest neighbor machine translation (kNN-MT) is a promising approach that enhances translation quality by equipping pre-trained neural machine translation (NMT) models with nearest neighbor retrieval. Despite its success, kNN-MT typically requires ample space to store its token-level datastore, which makes it less practical on edge devices or in online scenarios. In this paper, inspired by the concept of knowledge distillation, we offer a new perspective on easing the storage overhead through datastore distillation, which we formalize as a constrained optimization problem. We further design a novel model-agnostic iterative nearest neighbor merging method that yields an effective and efficient solution to the datastore distillation problem. Experiments on three benchmark datasets show that our approach not only reduces the volume of the datastore by up to 50% without significant performance degradation, but also outperforms other baselines by a large margin at the same compression rate. A further experiment on WikiText-103 demonstrates the effectiveness of our method on the language modeling task.
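The abstract does not spell out the merging objective, but the two moving parts can be illustrated with a minimal, self-contained NumPy sketch: (1) a token-level datastore mapping keys (decoder hidden states) to values (target tokens), queried by kNN retrieval at decoding time, and (2) compression by iteratively merging mutual nearest-neighbor key pairs that share the same value token. The mutual-nearest-pair merge rule, all function names, and the parameters (k, temperature, target_ratio) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def build_toy_datastore(n=1000, dim=16, vocab=20, seed=0):
    """Toy token-level datastore: keys stand in for decoder hidden
    states, values for target-token ids. Keys cluster around one
    centroid per token so that neighbors tend to share values, as
    they often do in real kNN-MT datastores."""
    rng = np.random.default_rng(seed)
    values = rng.integers(0, vocab, size=n)
    centroids = rng.normal(size=(vocab, dim)) * 5.0
    keys = (centroids[values] + rng.normal(size=(n, dim)) * 0.5).astype(np.float32)
    return keys, values

def merge_pass(keys, values):
    """One merging pass: replace each mutual-nearest-neighbor pair
    that shares a value with a single averaged (centroid) entry."""
    sq = (keys ** 2).sum(1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * keys @ keys.T  # pairwise squared L2
    np.fill_diagonal(d2, np.inf)
    nn = d2.argmin(1)                     # nearest neighbor of each entry
    done = np.zeros(len(keys), dtype=bool)
    new_keys, new_values = [], []
    for i in range(len(keys)):
        if done[i]:
            continue
        j = nn[i]
        if nn[j] == i and not done[j] and values[i] == values[j]:
            new_keys.append((keys[i] + keys[j]) / 2.0)   # merged key
            done[j] = True
        else:
            new_keys.append(keys[i])                     # kept unchanged
        new_values.append(values[i])
        done[i] = True
    return np.stack(new_keys), np.array(new_values)

def distill(keys, values, target_ratio=0.5, max_iters=10):
    """Iterate merge passes until the datastore shrinks to the target
    size or no mergeable pairs remain."""
    target = int(len(keys) * target_ratio)
    for _ in range(max_iters):
        new_keys, new_values = merge_pass(keys, values)
        stalled = len(new_keys) == len(keys)
        keys, values = new_keys, new_values
        if stalled or len(keys) <= target:
            break
    return keys, values

def knn_probs(query, keys, values, vocab, k=8, temperature=10.0):
    """kNN token distribution: weight the k nearest entries by
    exp(-distance/T) and accumulate mass per value token. kNN-MT
    interpolates this with the NMT model's own distribution."""
    d2 = ((keys - query) ** 2).sum(1)
    idx = np.argsort(d2)[:k]
    w = np.exp(-d2[idx] / temperature)
    p = np.zeros(vocab)
    np.add.at(p, values[idx], w)
    return p / p.sum()

keys, values = build_toy_datastore()
small_keys, small_values = distill(keys, values, target_ratio=0.5)
print(f"datastore entries: {len(keys)} -> {len(small_keys)}")
q = keys[0]
print("argmax token, full datastore:  ", knn_probs(q, keys, values, 20).argmax())
print("argmax token, merged datastore:", knn_probs(q, small_keys, small_values, 20).argmax())
```

In a real system the keys come from a trained NMT decoder and retrieval runs through an approximate index such as FAISS; the brute-force distances and the simple averaging rule above are only stand-ins for the paper's constrained-optimization formulation.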
Keywords
Nearest neighbor machine translation, datastore distillation