Data clone detection and visualization in spreadsheets

Felienne Hermans, Ben Sedee, Martin Pinzger, Arie van Deursen

Software Engineering (2013)

Abstract

Spreadsheets are widely used in industry: it is estimated that end-user programmers outnumber programmers by a factor of 5. However, spreadsheets are error-prone, and numerous companies have lost money because of spreadsheet errors. One of the causes of spreadsheet problems is the prevalence of copy-pasting. In this paper, we study this cloning in spreadsheets. Based on existing text-based clone detection algorithms, we have developed an algorithm to detect data clones in spreadsheets: formulas whose values are copied as plain text to a different location. To evaluate the usefulness of the proposed approach, we conducted two evaluations: a quantitative evaluation in which we analyzed the EUSES corpus, and a qualitative evaluation consisting of two case studies. The results of the evaluations clearly indicate that 1) data clones are common, 2) data clones pose threats to spreadsheet quality, and 3) our approach supports users in finding and resolving data clones.
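
To make the notion of a data clone concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a hypothetical in-memory sheet, a dict mapping (row, column) to a ("formula", value) or ("const", value) pair, and it assumes that a pasted block keeps its shape, so every copied cell sits at the same offset from its source formula. The function name, the min_size threshold, and the exact-equality comparison are all illustrative choices.

```python
from collections import defaultdict


def find_data_clone_candidates(cells, min_size=3):
    """Return (formula_block, offset) pairs that look like data clones.

    cells maps (row, col) -> ("formula", value) or ("const", value).
    Hypothetical representation; values are compared with exact equality.
    """
    # Index formula cells by the value they compute.
    formulas = defaultdict(list)
    for pos, (kind, value) in cells.items():
        if kind == "formula":
            formulas[value].append(pos)

    # For every constant that equals some formula value, record the
    # source formula under the offset between the two cells.
    by_offset = defaultdict(set)
    for (row, col), (kind, value) in cells.items():
        if kind != "const":
            continue
        for f_row, f_col in formulas.get(value, ()):
            by_offset[(row - f_row, col - f_col)].add((f_row, f_col))

    # Within one offset, group adjacent source formulas into blocks and
    # keep only blocks large enough to be unlikely coincidences.
    clones = []
    for offset, sources in by_offset.items():
        unvisited = set(sources)
        while unvisited:
            stack = [unvisited.pop()]
            block = []
            while stack:
                row, col = stack.pop()
                block.append((row, col))
                for nb in ((row + 1, col), (row - 1, col),
                           (row, col + 1), (row, col - 1)):
                    if nb in unvisited:
                        unvisited.remove(nb)
                        stack.append(nb)
            if len(block) >= min_size:
                clones.append((sorted(block), offset))
    return clones


if __name__ == "__main__":
    sheet = {
        (0, 0): ("formula", 10.5),
        (1, 0): ("formula", 21.0),
        (2, 0): ("formula", 33.25),
        (0, 3): ("const", 10.5),   # pasted copy of column 0
        (1, 3): ("const", 21.0),
        (2, 3): ("const", 33.25),
        (5, 5): ("const", 10.5),   # isolated match, filtered by min_size
    }
    print(find_data_clone_candidates(sheet))
```

The size threshold is what separates a likely copy-paste from a coincidental match: a single constant that happens to equal a formula result is ignored, while a block of adjacent matches at a shared offset is reported.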

Keywords

data clone, data clone detection, different location, spreadsheet error, qualitative evaluation, spreadsheet quality, spreadsheet problem, EUSES corpus, case study, quantitative evaluation, data visualization, code smells, clustering algorithms, data visualisation, algorithm design and analysis, software quality, cloning
