Software Vulnerability Prediction in Low-Resource Languages: An Empirical Study of CodeBERT and ChatGPT

PROCEEDINGS OF 2024 28TH INTERNATIONAL CONFERENCE ON EVALUATION AND ASSESSMENT IN SOFTWARE ENGINEERING, EASE 2024 (2024)

Abstract
Background: Software Vulnerability (SV) prediction in emerging languages is increasingly important to ensure software security in modern systems. However, these languages usually have limited SV data for developing high-performing prediction models. Aims: We conduct an empirical study to evaluate the impact of SV data scarcity in emerging languages on the state-of-the-art SV prediction model and investigate potential solutions to enhance the performance. Method: We train and test the state-of-the-art model based on CodeBERT with and without data sampling techniques for function-level and line-level SV prediction in three low-resource languages: Kotlin, Swift, and Rust. We also assess the effectiveness of ChatGPT for low-resource SV prediction given its recent success in other domains. Results: Compared to the original work in C/C++ with large data, CodeBERT's performance on function-level and line-level SV prediction declines significantly in low-resource languages, signifying the negative impact of data scarcity. Regarding remediation, data sampling techniques fail to improve CodeBERT, whereas ChatGPT showcases promising results, substantially enhancing predictive performance by up to 34.4% for the function level and up to 53.5% for the line level. Conclusion: We have highlighted the challenge and made the first promising step for low-resource SV prediction, paving the way for future research in this direction.
Keywords
Software vulnerability, Software security, Large language models, ChatGPT, Empirical study