Repairing Security Vulnerabilities Using Pre-trained Programming Language Models

2022 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W)

Abstract
Repairing software bugs with automated solutions is a long-standing goal of researchers. Some of the latest automated program repair (APR) tools leverage natural language processing (NLP) techniques to repair software bugs. However, natural languages (NL) and programming languages (PL) differ significantly, so these tools may not handle PL tasks well. Moreover, because vulnerability repair differs from general bug repair, the performance of these tools on vulnerability repair is not yet known. To address these issues, we apply large-scale pre-trained PL models (CodeBERT and GraphCodeBERT) to the vulnerability repair task, building on the characteristics of PL, and evaluate the real-world performance of state-of-the-art data-driven approaches to vulnerability repair. The results show that pre-trained PL models better capture and process PL features and can accomplish multi-line vulnerability repair. Specifically, our solution achieves strong results (95.47% single-line repair accuracy and 90.06% multi-line repair accuracy), outperforming the state-of-the-art data-driven approaches and demonstrating that rich data-dependency features help solve more complex code repair problems. Finally, we discuss previous work and our approach, pointing out shortcomings and directions for future work.
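The single-line and multi-line repair accuracies cited above are exact-match rates over generated patches, bucketed by how many lines the ground-truth fix spans. A minimal sketch of such an evaluation is shown below; the function name and the toy patches are illustrative assumptions, not the authors' code or data.

```python
# Hypothetical evaluation sketch: exact-match repair accuracy,
# split by whether the reference fix is single-line or multi-line.
# The example patches are invented for illustration only.

def repair_accuracy(predictions, references):
    """Return (single_line_acc, multi_line_acc) as exact-match rates."""
    buckets = {"single": [0, 0], "multi": [0, 0]}  # kind -> [correct, total]
    for pred, ref in zip(predictions, references):
        kind = "single" if len(ref.strip().splitlines()) == 1 else "multi"
        buckets[kind][1] += 1
        if pred.strip() == ref.strip():  # exact string match after trimming
            buckets[kind][0] += 1
    rate = lambda b: b[0] / b[1] if b[1] else 0.0
    return rate(buckets["single"]), rate(buckets["multi"])

# Toy data: one single-line fix (correct) and one multi-line fix (wrong).
preds = ["x = sanitize(y)", "if p:\n    free(p)"]
refs  = ["x = sanitize(y)", "if p:\n    free(p)\n    p = NULL"]
single_acc, multi_acc = repair_accuracy(preds, refs)
```

On this toy data the single-line bucket scores 1.0 and the multi-line bucket 0.0, reflecting the common observation (echoed in the abstract) that multi-line repairs are the harder case.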
Keywords
Automated program repair,Vulnerability repair,Programming language model