Scaling Laws Behind Code Understanding Model
CoRR (2024)
Abstract
The scaling law is becoming a fundamental law in many areas of machine learning:
test error falls off as a power law as training data, model size, and compute
increase. However, whether this law holds for the task of code understanding is
not well studied, and most current language models for code understanding have
about 100M parameters, which is relatively "small" compared to large language
models. In this paper, we conduct extensive experiments to investigate the
scaling law for the code understanding task by varying training data, model
size, and compute. We validate that the test error of code understanding models
falls off as a power law when using larger models, indicating that the scaling
law holds for the code understanding task. In addition, we apply models of
different scales to two downstream code understanding tasks and find that
performance improves with model scale. Finally, we train a large-scale code
understanding model named CoLSBERT, with 1.5B parameters, on a large dataset
using more compute; it outperforms previous work by a large margin. We will
release our code and the CoLSBERT model when our paper is published.
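The power-law relationship the abstract refers to can be checked empirically by fitting test error against model size in log-log space. The sketch below is illustrative only (the function name, synthetic data, and exponent are assumptions, not taken from the paper): a power law L(N) = a·N^(−b) is linear in the logs, so ordinary least squares recovers its parameters.

```python
import numpy as np

def fit_power_law(sizes, errors):
    """Fit errors ~ a * sizes**(-b) and return (a, b).

    A power law is linear in log-log space:
        log L = log a - b * log N
    so a degree-1 least-squares fit on the logs recovers both parameters.
    """
    log_n = np.log(np.asarray(sizes, dtype=float))
    log_l = np.log(np.asarray(errors, dtype=float))
    slope, intercept = np.polyfit(log_n, log_l, 1)
    return float(np.exp(intercept)), float(-slope)

# Synthetic sanity check (hypothetical numbers): generate noiseless data
# from a known power law and confirm the fit recovers a=2.0, b=0.1.
sizes = np.array([1e8, 3e8, 1e9, 1.5e9])   # parameter counts
errors = 2.0 * sizes ** (-0.1)             # test errors from the known law
a, b = fit_power_law(sizes, errors)
print(round(a, 3), round(b, 3))
```

On noiseless data this recovers the generating coefficients exactly; on real measurements the fitted exponent quantifies how fast test error decays with scale.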