MetricHunter: A software metric dataset generator utilizing SourceMonitor upon public GitHub repositories
SOFTWAREX(2023)
摘要
Version control systems are pervasively consulted nowadays to obtain software metric datasets. Accordingly, machine learning is applied to predict different aspects of a software including quality monitoring, influence analysis, etc. However, construction of a metric dataset is challenging and the dataset content may affect the success of the learning-based models. In this study, we propose a dataset construction tool, MetricHunter, which is able to produce platform/language specific datasets that can be used for predicting the features of newly created software. The proposed tool is developed by C# programming language utilizing a known metric gathering tool, i.e. SourceMonitor, and the GitHub REST API for public repositories. Thus, one can construct a proper dataset from a graphical user interface by simply specifying the programming language or target platform. The outputs of the tool on a set of repositories are validated by investigating automatically generated attribute values and comparing them with the measurements of metric gathering tools as well as the GitHub metric values.& COPY; 2023 The Author(s). Published by Elsevier B.V.This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
更多查看译文
关键词
Software quality,Software metrics,Dataset construction,Git version control system
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要