Leveraging Structure in Software Merge: An Empirical Study

Georg Seibt, Florian Heck,Guilherme Cavalcanti,Paulo Borba,Sven Apel

IEEE Transactions on Software Engineering(2021)

引用 7|浏览12
暂无评分
摘要
Large-scale software development today relies heavily on version control systems facilitating distributed development of software projects. For the purpose of merging diverging versions of the code base, version control systems employ line-based merge algorithms, which are applicable to all text files. Structured merge algorithms have been proposed as an alternative to unstructured, line-based merging, with the goal of reducing the number of merge conflicts that have to be manually resolved by the developer. By leveraging the structure inherent in source code (i.e., by representing source code files in terms of abstract syntax trees instead of sequences of text lines), these algorithms are able to merge revisions in various situations (e.g., reordering of methods) that would cause conflicts when merged using an unstructured approach. However, merging abstract syntax trees is inherently more complex than merging sequences of text lines, which makes structured merge algorithms computationally more expensive than an unstructured merge. To reduce the runtime cost of structured merge algorithms, semistructured merge as well as combinations of different merge strategies were proposed. As such, we observe a range of increasingly structured merge algorithms, which feature different characteristics in terms of conflict resolution and runtime. The progressively increasing use of structure to avoid merge conflicts or to automate conflict resolution raises a number of questions: How is the correctness of the code resulting from a merge affected when employing structured merge algorithms? Which algorithm strikes the best balance between runtime, conflict resolution potential, and correctness of the merge result? For the first time, we evaluate a whole range of merge algorithms (from unstructured over semistructured to structured as well as combinations) by replaying merge commits in a controlled setting. We employ the test suite of the projects in question as an oracle for the correctness of the resulting code, triangulated by a thorough manual analysis. Using 7727 merge commits from 10 open-source projects, we find that combined strategies appear to be the best of both worlds: They resolve as many conflicts as structured merge at a significantly lower runtime per merge commit. Notably, structured merge strategies do cause more test failures, however, the increase is small.
更多
查看译文
关键词
JD(IME),version control systems,software merge,structured merge,semistructured merge
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要