Ensemble boosted trees with synthetic features generation in application to bankruptcy prediction.

Expert Syst. Appl. (2016)

Abstract
We propose a novel ensemble model for bankruptcy prediction. We use Extreme Gradient Boosting as an ensemble of decision trees. We propose a new approach for generating synthetic features to improve prediction. The presented method is evaluated on real-life data of Polish companies.

Bankruptcy prediction has been a subject of interest for almost a century, and it still ranks among the hottest topics in economics. The aim of predicting financial distress is to develop a predictive model that combines various econometric measures and makes it possible to foresee the financial condition of a firm. In this domain, various methods have been proposed, based on statistical hypothesis testing, statistical modeling (e.g., generalized linear models), and, more recently, artificial intelligence (e.g., neural networks, Support Vector Machines, decision trees). In this paper, we propose a novel approach to bankruptcy prediction that uses Extreme Gradient Boosting to learn an ensemble of decision trees. Additionally, in order to reflect higher-order statistics in the data and to impose prior knowledge about data representation, we introduce a new concept that we refer to as synthetic features. A synthetic feature is a combination of the econometric measures using arithmetic operations (addition, subtraction, multiplication, division). Each synthetic feature can be seen as a single regression model that is developed in an evolutionary manner. We evaluate our solution on collected data about Polish companies in five tasks corresponding to bankruptcy prediction in the 1st, 2nd, 3rd, 4th, and 5th year. We compare our approach with the reference methods.
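
The synthetic-feature idea described in the abstract can be illustrated with a minimal Python sketch. It only shows how one new column might be built from two existing ones with an arithmetic operation. Everything named here is hypothetical and not taken from the paper: the function names, the zero fallback for division by zero, and the uniformly random choice of operands and operations, which merely stands in for the evolutionary selection that the abstract mentions but does not specify.

```python
import operator
import random


def safe_div(a, b):
    # Assumption of this sketch: a zero denominator yields 0.0.
    # The abstract does not say how division by zero is handled.
    return a / b if b != 0 else 0.0


# The four arithmetic operations named in the abstract.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": safe_div}


def make_synthetic_feature(rows, i, j, op):
    """Combine columns i and j of every row with the arithmetic operation op."""
    f = OPS[op]
    return [f(row[i], row[j]) for row in rows]


def add_random_synthetic_features(rows, n_new, rng):
    """Append n_new synthetic columns to a copy of rows.

    Each new column combines two distinct, randomly chosen existing columns,
    which may themselves be synthetic. Random choice is a placeholder for
    the paper's evolutionary procedure.
    """
    rows = [list(r) for r in rows]  # copy so the input is left untouched
    recipes = []
    for _ in range(n_new):
        n_cols = len(rows[0])
        i, j = rng.sample(range(n_cols), 2)
        op = rng.choice(sorted(OPS))
        column = make_synthetic_feature(rows, i, j, op)
        for row, value in zip(rows, column):
            row.append(value)
        recipes.append((i, op, j))  # record how each new column was built
    return rows, recipes


if __name__ == "__main__":
    # Toy values only; these are not data from the paper.
    data = [[1.0, 2.0, 4.0], [3.0, 0.0, 5.0]]
    extended, recipes = add_random_synthetic_features(data, 2, random.Random(0))
    print(recipes)
    print(extended)
```

In a full pipeline, the extended rows would then be passed to a gradient-boosted tree learner; that step is omitted here because the abstract gives no training details.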
Keywords
Bankruptcy prediction, Extreme gradient boosting, Synthetic features generation, Imbalanced data