JumpCoder: Go Beyond Autoregressive Coder via Online Modification
CoRR(2024)
Abstract
While existing code large language models (code LLMs) exhibit impressive
capabilities in code generation, their autoregressive sequential generation
inherently lacks reversibility. This limitation prevents them from going back
to add previously missing statements during coding, as humans do, which often
leads to error propagation and suboptimal performance. We introduce
JumpCoder, a novel model-agnostic framework that enables online modification and
non-sequential generation to augment the code LLMs. The key idea behind
JumpCoder is to insert new code into the currently generated code when
necessary during generation, which is achieved through an auxiliary infilling
model that works in tandem with the code LLM. Since identifying the best infill
position beforehand is intractable, we adopt an infill-first, judge-later
strategy, which experiments with filling at the k most critical positions
following the generation of each line, and uses an Abstract Syntax Tree (AST)
parser alongside the Generation Model Scoring to effectively judge the validity
of each potential infill. Extensive experiments using six state-of-the-art code
LLMs across multiple benchmarks consistently indicate significant improvements
over all baselines. Notably, JumpCoder assists code LLMs in achieving up to a
3.6
multilingual HumanEval benchmarks. Our code is public at
https://github.com/Keytoyze/JumpCoder.
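
The infill-first, judge-later idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `try_infills`, `toy_infill`, and `toy_score` are hypothetical, the infilling model and the scorer are replaced by trivial stand-ins, and the AST check is reduced to "does the code still parse", which is only an assumption about how such a validity judge might look.

```python
import ast
from typing import Callable, List, Optional


def is_valid_python(code: str) -> bool:
    """Return True if the code parses into a Python AST."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


def try_infills(
    lines: List[str],
    positions: List[int],
    infill: Callable[[List[str], int], Optional[str]],
    score: Callable[[List[str]], float],
) -> List[str]:
    """Infill first at each candidate position, then judge each result.

    An infill is kept only if the code still parses and the (assumed)
    score improves over the best version seen so far.
    """
    best_lines, best_score = lines, score(lines)
    for pos in positions:
        new_line = infill(lines, pos)
        if not new_line:
            continue  # the infilling model proposed nothing here
        candidate = lines[:pos] + [new_line] + lines[pos:]
        if not is_valid_python("\n".join(candidate)):
            continue  # judged invalid by the parser
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best_lines, best_score = candidate, candidate_score
    return best_lines


def toy_infill(lines: List[str], pos: int) -> Optional[str]:
    # Hypothetical infilling model: only proposes an import at the top.
    return "import math" if pos == 0 else None


def toy_score(lines: List[str]) -> float:
    # Hypothetical scorer: penalise using `math` without importing it.
    text = "\n".join(lines)
    return 0.0 if "math." in text and "import math" not in text else 1.0


generated = ["def area(r):", "    return math.pi * r * r"]
repaired = try_infills(generated, [0, 1], toy_infill, toy_score)
print("\n".join(repaired))
```

In this toy run the generated function uses `math` without importing it, so the candidate with `import math` inserted at position 0 both parses and scores higher, and is kept. In a real setting the scorer would come from the generation model rather than from a string check.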