谷歌浏览器插件
订阅小程序
在清言上使用

LogChunks: A Data Set for Build Log Analysis

International Conference on Software Engineering(2020)

引用 7|浏览43
暂无评分
摘要
Build logs are textual by-products that a software build process creates, often as part of its Continuous Integration (CI) pipeline. Build logs are a paramount source of information for developers when debugging into and understanding a build failure. Recently, attempts to partly automate this time-consuming, purely manual activity have come up, such as rule- or information-retrieval-based techniques. We believe that having a common data set to compare different build log analysis techniques will advance the research area. It will ultimately increase our understanding of CI build failures. In this paper, we present LogChunks, a collection of 797 annotated Travis CI build logs from 80 GitHub repositories in 29 programming languages. For each build log, LogChunks contains a manually labeled log part (chunk) describing why the build failed. We externally validated the data set with the developers who caused the original build failure. The width and depth of the LogChunks data set are intended to make it the default benchmark for automated build log analysis techniques.
更多
查看译文
关键词
CI,Build Log Analysis,Build Failure,Chunk Retrieval
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要