Compact Abstract Graphs for Detecting Code Vulnerability with GNN Models.

ACSAC (2022)

Abstract
Source code representation is critical to machine-learning-based approaches to detecting code vulnerabilities. This paper proposes Compact Abstract Graphs (CAGs) of source code in different programming languages for predicting a broad range of code vulnerabilities with Graph Neural Network (GNN) models. CAGs align the source code representation with the task of vulnerability classification and reduce the graph size to accelerate model training with minimal impact on prediction performance. We have applied CAGs to six GNN models and large Java/C datasets with 114 vulnerability types in Java programs and 106 vulnerability types in C programs. The experimental results show that the GNN models perform well, with accuracy ranging from 94.7% to 96.3% on the Java dataset and from 91.6% to 93.2% on the C dataset. The resulting GNN models have achieved promising performance when applied to more than 2,500 vulnerabilities collected from real-world software projects. The results also show that using CAGs for GNN models is significantly better than using ASTs (Abstract Syntax Trees), CFGs (Control Flow Graphs), or PDGs (Program Dependence Graphs). A comparative study has demonstrated that the CAG-based GNN models can outperform existing methods for machine-learning-based vulnerability detection.
Keywords
Software vulnerability, machine learning, graph neural networks, static code analysis