FT2Ra: A Fine-Tuning-Inspired Approach to Retrieval-Augmented Code Completion
CoRR (2024)
Abstract
The rise of code pre-trained models has significantly enhanced various coding
tasks, such as code completion, and tools like GitHub Copilot. However, the
substantial size of these models, especially large models, poses a significant
challenge when it comes to fine-tuning them for specific downstream tasks. As
an alternative approach, retrieval-based methods have emerged as a promising
solution, augmenting model predictions without the need for fine-tuning.
Despite their potential, a significant challenge is that the designs of these
methods often rely on heuristics, leaving critical questions about what
information should be stored or retrieved and how to interpolate such
information for augmenting predictions.
To tackle this challenge, we first perform a theoretical analysis of the
fine-tuning process, highlighting the importance of delta logits as a catalyst
for improving model predictions. Building on this insight, we develop a novel
retrieval-based method, FT2Ra, which aims to mimic genuine fine-tuning. While
FT2Ra adopts a retrieval-based mechanism, it uniquely employs a paradigm with a
learning rate and multi-epoch retrievals, similar to fine-tuning. In
token-level completion, which represents a relatively easier task, FT2Ra
achieves a 4.29% improvement in accuracy compared to the best baseline method
on UniXcoder. In the more challenging line-level completion task, we observe a
more than twofold increase in Exact Match (EM) performance, indicating the
significant advantages of our theoretical analysis. Notably,
even when operating without actual fine-tuning, FT2Ra exhibits competitive
performance compared to the models with real fine-tuning.
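To make the retrieval-with-learning-rate paradigm above concrete, the following is a minimal Python sketch of the general idea, not the paper's actual algorithm: a datastore maps hidden-state keys from previously seen contexts to their ground-truth targets, and each retrieval "epoch" nudges the model's logits by a learning-rate-scaled delta, mirroring the fact that a cross-entropy gradient step changes the logits by roughly lr * (y - softmax(z)). All names (retrieve_neighbors, ft2ra_style_logits), the softmax-of-negative-distance weighting, and the default lr, epochs, and k values are illustrative assumptions.

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def retrieve_neighbors(keys, targets, query, k=8):
    # L2 distance between the query hidden state and all stored context keys.
    dists = np.linalg.norm(keys - query, axis=1)
    idx = np.argsort(dists)[:k]
    return targets[idx], dists[idx]

def ft2ra_style_logits(base_logits, hidden, keys, targets, lr=1.0, epochs=3, k=8):
    # Pseudo fine-tuning via retrieval: each "epoch" applies a delta-logit
    # update computed from retrieved neighbors' targets, analogous to the
    # cross-entropy gradient step delta_z = lr * (y - softmax(z)).
    logits = base_logits.copy()
    for _ in range(epochs):
        y, dists = retrieve_neighbors(keys, targets, hidden, k)
        w = np.exp(-dists)
        w /= w.sum()                               # closer neighbors weigh more
        y_bar = w @ y                              # aggregated target distribution
        logits += lr * (y_bar - softmax(logits))   # delta-logit update step
    return logits

# Toy usage: vocab of 5, hidden size 4, datastore of 100 (key, one-hot target) pairs.
rng = np.random.default_rng(0)
keys = rng.normal(size=(100, 4))
targets = np.eye(5)[rng.integers(0, 5, size=100)]
print(ft2ra_style_logits(rng.normal(size=5), rng.normal(size=4), keys, targets))

With lr and epochs exposed as knobs, the update behaves like repeated small gradient steps toward the retrieved targets; the paper's derivation of what to store and how to interpolate it is more precise than this sketch.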