Robust regression via error tolerance

Anton Björklund, Andreas Henelius, Emilia Oikarinen, Kimmo Kallonen, Kai Puolamäki

Data Mining and Knowledge Discovery (2022)


Abstract

Real-world datasets are often characterised by outliers: data items that do not follow the same structure as the rest of the data. These outliers can negatively influence the modelling of the data. In data analysis it is therefore important to consider methods that are robust to outliers. In this paper we develop a robust regression method that finds the largest subset of data items that can be approximated using a sparse linear model to a given precision. We show that this can yield the best possible robustness to outliers. However, this problem is NP-hard, and to solve it we present an efficient approximation algorithm, termed SLISE. Our method extends existing state-of-the-art robust regression methods, especially in terms of speed on high-dimensional datasets. We demonstrate our method by applying it to both synthetic and real-world regression problems.
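The core idea of the abstract, selecting the data items that a linear model approximates "to a given precision", can be illustrated with a minimal sketch. This is not the SLISE algorithm itself and does not search for the model. It only evaluates, for a candidate coefficient vector, which items fall within the error tolerance. The names `residuals`, `subset_within_tolerance`, `alpha` and `epsilon`, as well as the toy data, are assumptions made for illustration and do not come from the paper.

```python
def residuals(X, y, alpha):
    """Signed residuals y_i - x_i . alpha for each data item (hypothetical helper)."""
    return [yi - sum(a * xij for a, xij in zip(alpha, xi)) for xi, yi in zip(X, y)]


def subset_within_tolerance(X, y, alpha, epsilon):
    """Indices of the items whose absolute residual is at most epsilon."""
    return [i for i, r in enumerate(residuals(X, y, alpha)) if abs(r) <= epsilon]


# Toy data (assumed): y = 2*x + 1 for the first four items, then two outliers.
# The first column of X is a constant 1 acting as the intercept.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]]
y = [1.0, 3.0, 5.0, 7.0, 30.0, -20.0]

alpha = [1.0, 2.0]   # candidate model: intercept 1, slope 2
epsilon = 0.5        # error tolerance (the "given precision")

inliers = subset_within_tolerance(X, y, alpha, epsilon)
print(inliers)       # indices of the items the candidate model fits within epsilon
```

Under this reading, a better candidate model is one for which the returned subset is larger. Finding the model that maximises the subset size is the part the abstract describes as NP-hard, and it is deliberately left out of this sketch.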


Keywords

Robust Regression, Robust Statistics, Outlier Detection, Sparsity
