Shibboleth: Hybrid Patch Correctness Assessment in Automated Program Repair.

ASE(2022)

引用 0|浏览15
暂无评分
摘要
Test-based generate-and-validate automated program repair (APR) systems generate many patches that pass the test suite without fixing the bug. The generated patches must be manually inspected by the developers, a task that tends to be time-consuming, thereby diminishing the role of APR in reducing debugging costs. We present the design and implementation of a novel tool, named Shibboleth, for automatic assessment of the patches generated by test-based generate-and-validate APR systems. Shibboleth leverages lightweight static and dynamic heuristics from both test and production code to rank and classify the patches. Shibboleth is based on the idea that the buggy program is almost correct and the bugs are small mistakes that require small changes to fix and specifically the fix does not remove the code implementing correct functionality of the program. Thus, the tool measures the impact of patches on both production code (via syntactic and semantic similarity) and test code (via code coverage) to separate the patches that result in similar programs and that do not remove desired program elements. We have evaluated Shibboleth on 1,871 patches, generated by 29 Java-based APR systems for Defects4J programs. The technique outperforms state-of-the-art raking and classification techniques. Specifically, in our ranking data set, in 66% of the cases, Shibboleth ranks the correct patch in top-1 or top-2 positions and, in our classification data set, it achieves an accuracy and F1-score of 0.887 and 0.852, respectively, in classification mode. A demo video of the tool is available at https://bit.ly/3NvYJN8.
更多
查看译文
关键词
Automated Program Repair, Patch Correctness Assessment, Similarity, Branch Coverage
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要