Multiple Imputation Ensembles for Time Series (MIE-TS)

ACM Transactions on Knowledge Discovery from Data (2022)

Abstract
Time series classification has become an active field of research thanks to the extensive studies conducted over the past two decades. Time series may contain missing data, which can affect both the representation and the modeling of the series. Recovering missing data with appropriate time-series-based imputation methods is therefore an essential step. Multiple imputation is a data recovery method that produces multiple imputed datasets. The method has proved useful for reflecting the uncertainty inherent in missing data; however, it is under-researched in time series problems. In this paper we propose two multiple imputation approaches for time series. The first is a multiple imputation method based on interpolation. The second is a multiple imputation and ensemble method. First, we simulate missing consecutive sub-sequences under a Missing Completely at Random mechanism; then we apply single/multiple imputation methods. The imputed data are used to build bagging and stacking ensembles. We build ensembles using standard classification algorithms as well as time series classifiers. The standard classifiers are Random Forest, Support Vector Machines, K-Nearest Neighbour, C4.5, and PART, while TSCHIEF, Proximity Forest, Time Series Forest, RISE, and BOSS are chosen as time series classifiers. Our findings show that combining multiple imputation with ensembles improves the performance of the majority of classifiers tested in this study, often above the performance obtained from the complete data, even as the proportion of missing data increases. This may be because the diversity injected by multiple imputation has a favourable and stabilising effect on classifier performance, which is an important finding.
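The pipeline outlined in the abstract (simulate a missing consecutive sub-sequence, impute it several times, combine classifiers trained on each imputed copy) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the paper's implementation: the function names are hypothetical, noise-perturbed linear interpolation stands in for the interpolation-based multiple imputation, a 1-nearest-neighbour classifier stands in for the classifiers listed, and a plain majority vote stands in for the bagging and stacking ensembles.

```python
import random
from collections import Counter


def remove_gap(series, gap_len, rng):
    """Blank out one consecutive sub-sequence at a uniformly random position.

    The position does not depend on the values, which mimics an MCAR
    mechanism. Both endpoints stay observed so interpolation is possible.
    """
    start = rng.randrange(1, len(series) - gap_len)
    out = list(series)
    for i in range(start, start + gap_len):
        out[i] = None
    return out


def impute_once(series, noise_sd, rng):
    """Fill each gap by linear interpolation plus Gaussian noise.

    Assumption: the noise term is what makes repeated calls differ, giving
    several plausible completions instead of a single one.
    """
    out = list(series)
    i = 0
    while i < len(out):
        if out[i] is None:
            j = i
            while out[j] is None:
                j += 1
            left, right = out[i - 1], out[j]
            for k in range(i, j):
                frac = (k - i + 1) / (j - i + 1)
                out[k] = left + frac * (right - left) + rng.gauss(0.0, noise_sd)
            i = j
        else:
            i += 1
    return out


def multiple_impute(dataset, m, noise_sd, rng):
    """Return m imputed copies of a dataset (a list of series)."""
    return [[impute_once(s, noise_sd, rng) for s in dataset] for _ in range(m)]


def nn_predict(train_X, train_y, x):
    """1-nearest-neighbour prediction under squared Euclidean distance."""
    dists = [sum((a - b) ** 2 for a, b in zip(t, x)) for t in train_X]
    return train_y[dists.index(min(dists))]


def ensemble_predict(imputed_trains, train_y, x):
    """Majority vote over one base classifier per imputed training set."""
    votes = [nn_predict(tx, train_y, x) for tx in imputed_trains]
    return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    rng = random.Random(0)
    complete = [[0, 1, 2, 3, 4, 5], [5, 4, 3, 2, 1, 0]]
    labels = ["up", "down"]
    incomplete = [remove_gap(s, 2, rng) for s in complete]
    imputed = multiple_impute(incomplete, m=5, noise_sd=0.1, rng=rng)
    print(ensemble_predict(imputed, labels, [0, 1, 2, 3, 4, 5]))
```

Each of the `m` imputed training sets yields its own base classifier, so the variation between imputations is what supplies the ensemble diversity that the abstract credits for the stabilising effect.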
Keywords
Missing data, multiple imputation, time series, ensemble methods