The Larger the Better? Improved LLM Code-Generation via Budget Reallocation
arXiv (2024)
Abstract
It is a common belief that large language models (LLMs) are better than
smaller-sized ones. However, larger models also require significantly more time
and compute during inference. This raises the question: what happens when both
models operate under the same budget (e.g., compute, run-time)? To address
this question, we analyze code generation LLMs of various sizes and make
comparisons such as running a 70B model once vs. generating five outputs from a
13B model and selecting one. Our findings reveal that, in a standard unit-test
setup, the repeated use of smaller models can yield consistent improvements,
with gains of up to 15%. By contrast, in scenarios
where unit-tests are unavailable, a ranking-based selection of candidates from
the smaller model falls short of the performance of a single output from larger
ones. Our results highlight the potential of using smaller models instead of
larger ones, and the importance of studying approaches for ranking LLM outputs.
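
The equal-budget comparison described above (one output from a large model vs. several outputs from a smaller one, with unit tests picking the winner) can be illustrated with the widely used unbiased pass@k estimator. This is a minimal sketch, not the paper's evaluation code: the function name `pass_at_k` and the sample counts in the usage lines are hypothetical and chosen only for illustration, and the abstract does not state which exact metric the authors use.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k samples passes the unit tests.

    n: total samples generated for a problem
    c: number of those samples that pass the unit tests
    k: number of samples we are allowed to draw under the budget
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples fail, any draw of k must contain a passing one.
    if n - c < k:
        return 1.0
    # 1 - P(all k drawn samples fail), drawing without replacement.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical counts, for illustration only (not results from the paper):
# a small model that passes 3 of 10 samples, allowed 5 draws,
# vs. a large model that passes 5 of 10 samples, allowed 1 draw.
small_model_score = pass_at_k(n=10, c=3, k=5)
large_model_score = pass_at_k(n=10, c=5, k=1)
print(f"small model, k=5: {small_model_score:.3f}")
print(f"large model, k=1: {large_model_score:.3f}")
```

With these made-up counts the smaller model scores higher despite a lower per-sample pass rate, which is the kind of effect the abstract reports for the unit-test setting. When no unit tests are available, the selection step has to be replaced by a ranking of candidates, which the abstract reports as falling short of a single output from a larger model.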