Machine Learning Techniques for Escaped Defect Analysis in Software Testing

Lidia P. G. Nascimento, Ricardo B. C. Prudencio, Alexandre C. Mota, Audir A. Paiva Filho, Pedro H. A. Cruz, Daniel C. C. A. de Oliveira, Pedro R. S. Moreira

PROCEEDINGS OF THE 8TH BRAZILIAN SYMPOSIUM ON SYSTEMATIC AND AUTOMATED SOFTWARE TESTING, SAST 2023 (2023)


Abstract

Software testing is crucial to ensuring the quality of software under development. Once a potential bug is identified, a Bug Report (BR) is opened with information to describe and reproduce the issue. In large companies, different testing teams typically open hundreds of BRs each week, all of which have to be inspected and fixed adequately. This paper focuses on the use of Machine Learning (ML) techniques to automate Escaped Defect Analysis (EDA), an important but expensive task for improving the effectiveness of testing teams. In our work, Escaped Defects (EDs) are bugs or issues that should have been reported by a specific team but were accidentally found by another team. The occurrence of EDs is risky, as it is usually related to failures in the testing activities. EDA is usually performed manually by software engineers, who read each BR's textual content to judge whether it is an ED. This is challenging and time-consuming. In our solution, the BR's content is preprocessed with textual operations, and the resulting feature representation is passed to an ML classifier that returns the probability of the EDA labels. Experiments were performed on a dataset of 3767 BRs provided by Motorola Mobility Comercio de Produtos Eletronicos Ltda. Different ML algorithms were used to build classifiers, obtaining high AUC values (usually higher than 0.8) in a cross-validation experiment. This result indicates a good trade-off between the number of EDs correctly identified and the number of BRs that actually have to be inspected in the EDA process.

This paper presents an ML-based approach to classifying escaped defects described in bug reports. EDs are bugs that were missed by the QA team in charge and happened to be uncovered by a different team. To automate the identification of EDs (a costly and error-prone task), a dataset from a partner company is leveraged, text processing operators are adopted for feature engineering, and 6 classical ML algorithms are applied. The results show satisfactory accuracy and AUC, and the experiments indicate a good trade-off between the number of EDs correctly identified and the number of BRs that have to be inspected in the EDA.
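
The trade-off mentioned in the abstract can be illustrated with a small sketch. It assumes a classifier has already assigned each BR a probability of being an ED, and it computes two quantities: the AUC, and the fraction of EDs recovered when only the highest-scored BRs are inspected. The function names (`auc`, `recall_at_budget`) and the toy labels and scores are hypothetical and chosen only for illustration; they are not taken from the paper or its dataset, and the paper may compute its metrics differently.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen ED is scored above a randomly chosen non-ED
    (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one ED and one non-ED")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recall_at_budget(labels, scores, budget):
    """Fraction of all EDs found when only the `budget` BRs with the
    highest predicted ED probability are inspected."""
    total = sum(labels)
    if total == 0:
        raise ValueError("no EDs in the labels")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    found = sum(y for _, y in ranked[:budget])
    return found / total


# Hypothetical toy data: 1 = escaped defect, 0 = not an escaped defect.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
# Hypothetical classifier outputs (predicted probability of being an ED).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

if __name__ == "__main__":
    print("AUC:", auc(toy_labels, toy_scores))
    for budget in (3, 6):
        print("budget", budget, "recall", recall_at_budget(toy_labels, toy_scores, budget))
```

In this reading, a higher AUC means that EDs tend to rank near the top of the list, so an engineer can inspect a small share of the BRs and still recover most of the EDs.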


Keywords

Escaped Defect Analysis, Bug Reports, Machine Learning
