mlscorecheck: Testing the consistency of reported performance scores and experiments in machine learning

György Kovács, Attila Fazekas

Neurocomputing (2024)

Abstract
Addressing the reproducibility crisis in artificial intelligence through the validation of reported experimental results is a challenging task. It necessitates either the reimplementation of techniques or a meticulous assessment of papers for deviations from the scientific method and best statistical practices. To facilitate the validation of reported results, we have developed numerical techniques capable of identifying inconsistencies between reported performance scores and various experimental setups in machine learning problems, including binary/multiclass classification and regression. These consistency tests are integrated into the open-source package mlscorecheck, which also provides specific test bundles designed to detect systematically recurring flaws in various fields, such as retina image processing and synthetic minority oversampling.
Keywords
Binary classification, Multiclass classification, Regression, Consistency testing, Performance scores, Open source
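
The kind of consistency test the abstract describes can be illustrated for the simplest case: binary classification on a single test set with known class counts. The following is a minimal sketch, not the mlscorecheck API. The function name `is_consistent`, the exhaustive search, the round-to-nearest assumption, and the example figures are all hypothetical choices made for illustration. It asks whether any integer pair of true positives and true negatives could have produced the reported accuracy, sensitivity, and specificity.

```python
def is_consistent(p, n, reported, decimals):
    """Return True if some integer (tp, tn) reproduces every reported score.

    p, n      -- number of positive and negative test samples
    reported  -- dict with any of the keys "acc", "sens", "spec"
    decimals  -- number of decimal places the scores were rounded to
                 (assumption: round-to-nearest)
    """
    # Largest error that rounding to `decimals` places can introduce,
    # plus a tiny slack for floating-point noise.
    tol = 0.5 * 10 ** (-decimals) + 1e-12

    # Exhaustively try every possible confusion matrix.
    for tp in range(p + 1):
        for tn in range(n + 1):
            scores = {
                "acc": (tp + tn) / (p + n),
                "sens": tp / p,
                "spec": tn / n,
            }
            if all(abs(scores[k] - v) <= tol for k, v in reported.items()):
                return True
    return False


# Hypothetical test set: 100 positives, 200 negatives.
ok = is_consistent(100, 200, {"acc": 0.8333, "sens": 0.8, "spec": 0.85}, 4)
bad = is_consistent(100, 200, {"acc": 0.9, "sens": 0.8, "spec": 0.85}, 4)
print(ok, bad)
```

In the second call the reported sensitivity and specificity pin down the counts of true positives and true negatives, and the accuracy they imply does not match the reported value, so no confusion matrix fits and the scores are flagged as inconsistent. Exhaustive search is only practical for small test sets and a single evaluation; it is used here because it is easy to verify by hand.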