Revisiting and Improving SZZ Implementations

Edmilson Campos Neto, Daniel Alencar da Costa, Uirá Kulesza

2019 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) (2019)

Abstract

Background: The SZZ algorithm was proposed to identify bug-introducing changes, i.e., changes that are likely to induce bugs. Previous studies improved its implementation and evaluated its results.

Aims: We aim to address existing limitations of SZZ in order to improve the maturity of the algorithm. We also aim to verify whether the improvements that have been proposed to the SZZ algorithm hold in different datasets.

Method: We re-evaluate two recent SZZ implementations using an adaptation of the Defects4J dataset, which works as a preprocessed dataset that can be used by SZZ. Furthermore, we revisit the limitations of RA-SZZ (refactoring-aware SZZ) to improve the precision and recall of the algorithm.

Results: We observe that a median of 44% of the lines that are flagged by the improved SZZ are very likely to introduce a bug. We manually analyze the SZZ-generated data and observe that there exist refactoring operations (31.17%) and equivalent changes (13.64%) that are still misidentified by the improved SZZ.

Conclusion: By preprocessing the dataset that is used as input by SZZ, the accuracy of SZZ may be considerably improved. For example, we observe that SZZ implementations are approximately 40% more accurate if only valid bug-fix lines are used as the input for SZZ.

Keywords

SZZ algorithm, refactoring change, bug-introducing change, bug-fix change
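
Illustrative sketch

The abstract refers to flagging bug-introducing changes from bug-fix lines and to measuring precision and recall. The toy Python sketch below illustrates that general idea only; it is not the authors' implementation. The `FixedLine` record, the `is_refactoring` flag (assumed to come from some refactoring detector), the `is_cosmetic` filter, and all commit names are hypothetical. A real refactoring-aware SZZ would typically keep tracing history past a refactoring, whereas this sketch simply drops the line.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class FixedLine:
    """A line removed or modified by a bug-fix commit (hypothetical record)."""
    text: str                     # content of the line before the fix
    last_changed_by: str          # commit that last touched it, as a blame tool would report
    is_refactoring: bool = False  # assumption: supplied by a refactoring detector


def is_cosmetic(text: str) -> bool:
    """Treat blank and comment-only lines as unable to introduce a bug."""
    stripped = text.strip()
    return stripped == "" or stripped.startswith(("//", "/*", "*"))


def szz_candidates(fix_lines: List[FixedLine], refactoring_aware: bool) -> Set[str]:
    """Return the commits flagged as candidate bug-introducing changes."""
    candidates: Set[str] = set()
    for line in fix_lines:
        if is_cosmetic(line.text):
            continue  # equivalent change: cannot alter behaviour
        if refactoring_aware and line.is_refactoring:
            continue  # simplification: skip instead of tracing further back
        candidates.add(line.last_changed_by)
    return candidates


def precision_recall(flagged: Set[str], truth: Set[str]) -> Tuple[float, float]:
    """Precision and recall of flagged commits against a labelled ground truth."""
    true_positives = len(flagged & truth)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy input: three lines touched by one bug-fix commit (invented data).
fix_lines = [
    FixedLine("if (count > max) {", "c1"),
    FixedLine("// TODO tidy up", "c2"),
    FixedLine("int total = computeTotal(items);", "c3", is_refactoring=True),
]
truth = {"c1"}  # assumed ground-truth bug-introducing commit

plain = szz_candidates(fix_lines, refactoring_aware=False)
aware = szz_candidates(fix_lines, refactoring_aware=True)
print(precision_recall(plain, truth))  # (0.5, 1.0)
print(precision_recall(aware, truth))  # (1.0, 1.0)
```

In this invented example, filtering the refactoring line removes one false positive, which raises precision without lowering recall. The numbers carry no relation to the results reported in the abstract.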
