Clone detection through srcClone: A program slicing based approach

Journal of Systems and Software (2022)

Abstract
Software clone detection is a commonly used approach to understanding and maintaining software systems. One category of clones that is challenging to detect but very useful is semantic clones, which are similar in semantics but differ significantly in syntax. Semantic clone detectors have trouble scaling to larger systems and sometimes struggle with recall and precision. To address this, we developed srcClone, a scalable slice-based approach that detects both syntactic and semantic code clones. srcClone ascertains the similarity of code segments by assessing the similarity of their corresponding program slices. We employ a lightweight, publicly available, scalable program slicer within our clone detection approach. By using dependency analysis to detect and assess cloned components, we gain insight into the code components that a clone pair or set can affect; these components are critical in impact analysis. Program analysts can also run srcClone on non-compilable and incomplete source code, which serves comprehension and maintenance tasks very well. We first evaluate srcClone by comparing it to six state-of-the-art tools and two additional semantic clone detectors in performance, recall, and precision. We use the BigCloneBench benchmark of real clones to facilitate the comparison. We use an additional 16 established benchmark scenarios to perform a qualitative comparison between srcClone and 44 clone detection approaches in their ability to detect these scenarios. To further measure scalability, we evaluate srcClone on 191 versions of the Linux kernel, containing approximately 87 MLOC. Our evaluations show that the approach is both relatively scalable and accurate. While its precision is slightly lower than that of some other techniques, it compensates with higher recall, including semantic clones that none of the existing techniques could find.
We contend that our approach is an important advancement in software clone detection in that it is both demonstrably scalable and able to detect more types of clones than existing work, thus giving developers greater insight into their software.
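To make the idea of comparing code segments through their program slices more concrete, the following is a minimal sketch in Python. It is not srcClone's algorithm or its slicer: the statement encoding, the data-dependence-only backward slice over straight-line code, the Jaccard measure, and every name (backward_slice, slice_set, slice_similarity) are assumptions chosen purely for illustration.

```python
from typing import FrozenSet, List, Set, Tuple

# Toy stand-in for source code (an assumption of this sketch):
# each statement is (variable_defined, variables_used).
Statement = Tuple[str, Tuple[str, ...]]


def backward_slice(program: List[Statement], target: int) -> FrozenSet[int]:
    """Indices of the statements that statement `target` depends on.

    Only data dependences in straight-line code are followed; control
    dependences, pointers, and calls are deliberately ignored.
    """
    in_slice = {target}
    needed = set(program[target][1])
    # Walk backwards so the most recent definition of a variable wins.
    for i in range(target - 1, -1, -1):
        defined, used = program[i]
        if defined in needed:
            in_slice.add(i)
            needed.discard(defined)
            needed.update(used)
    return frozenset(in_slice)


def slice_set(program: List[Statement]) -> Set[FrozenSet[int]]:
    """One backward slice per statement, expressed as index sets so that
    the result does not depend on variable names."""
    return {backward_slice(program, i) for i in range(len(program))}


def slice_similarity(p: List[Statement], q: List[Statement]) -> float:
    """Jaccard similarity between the slice sets of two segments."""
    a, b = slice_set(p), slice_set(q)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Two segments with different identifiers but the same dependence structure.
seg_a = [("total", ()), ("count", ()), ("avg", ("total", "count"))]
seg_b = [("s", ()), ("k", ()), ("mean", ("s", "k"))]
# A segment whose last statement depends on only one earlier value.
seg_c = [("s", ()), ("k", ()), ("mean", ("s",))]

print(slice_similarity(seg_a, seg_b))
print(slice_similarity(seg_a, seg_c))
```

Because the slices are compared as sets of statement positions rather than as text, renaming identifiers does not change the score, which is the intuition behind using slices to find clones that differ in syntax. A real detector would need far richer slices and a more robust similarity measure than this toy version.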
Keywords
Code clone, Clone detection, Program slicing, Semantic clones