A mixed-integer exponential cone programming formulation for feature subset selection in logistic regression

Sahand Asgharieh Ahari, Burak Kocuk

EURO Journal on Computational Optimization (2023)

Abstract
Logistic regression is one of the most widely used classification tools for constructing prediction models. For datasets with a large number of features, feature subset selection methods are used to obtain accurate and interpretable prediction models by removing irrelevant and redundant features. In this paper, we address the problem of feature subset selection in logistic regression using modern optimization techniques. To this end, we formulate the problem as a mixed-integer exponential cone program (MIEXP). To the best of our knowledge, this is the first time that both the nonlinear and discrete aspects of the underlying problem are fully considered within an exact optimization framework. We derive different versions of the MIEXP model by means of regularization and goodness-of-fit measures, including the Akaike Information Criterion and the Bayesian Information Criterion. Finally, we solve our MIEXP models using the solver MOSEK and evaluate the performance of the different versions on a set of toy examples and benchmark datasets. The results show that our approach is quite successful in obtaining accurate and interpretable prediction models compared to other methods from the literature.
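The abstract does not state the formulation itself, so the following is only a rough sketch of what a mixed-integer exponential cone model for this problem can look like, not the authors' exact model. All notation is assumed for illustration: labels y_i in {-1, +1}, feature vectors x_i in R^p, coefficients beta with intercept beta_0, binary selection indicators z_j, a big-M bound M on the coefficients, and a cardinality limit k. The sketch relies on the standard conic representation of the logistic loss, t >= log(1 + e^w), as two exponential cone memberships.

```latex
% Sketch under assumed notation; K_exp is the exponential cone
%   K_exp = cl{ (a, b, c) : a >= b * exp(c / b), b > 0 }.
\begin{align*}
\min_{\beta_0,\,\beta,\,z,\,t,\,u,\,v}\quad & \sum_{i=1}^{n} t_i \\
\text{s.t.}\quad
& u_i + v_i \le 1, && i = 1,\dots,n, \\
& \bigl(u_i,\; 1,\; -y_i(\beta_0 + x_i^{\top}\beta) - t_i\bigr) \in K_{\exp}, && i = 1,\dots,n, \\
& \bigl(v_i,\; 1,\; -t_i\bigr) \in K_{\exp}, && i = 1,\dots,n, \\
& -M z_j \le \beta_j \le M z_j, && j = 1,\dots,p, \\
& \sum_{j=1}^{p} z_j \le k, \qquad z \in \{0,1\}^{p}.
\end{align*}
% The first three constraint families together enforce
%   t_i >= \log(1 + \exp(-y_i(\beta_0 + x_i^\top \beta))),
% since u_i >= e^{w_i - t_i}, v_i >= e^{-t_i}, and u_i + v_i <= 1
% imply e^{w_i} + 1 <= e^{t_i}, with w_i = -y_i(\beta_0 + x_i^\top \beta).
```

Under the same assumptions, an information-criterion variant could drop the cardinality constraint and instead penalize the number of selected features in the objective, for example minimizing 2 Σ t_i + 2 Σ z_j for an AIC-style objective or 2 Σ t_i + ln(n) Σ z_j for a BIC-style objective; whether this matches the paper's variants is not stated in the abstract.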
Keywords
feature subset selection, exponential cone programming formulation, mixed-integer