Large Scale Support Vector Regression For Aviation Safety

2015 IEEE International Conference on Big Data (Big Data) (2015)

Abstract
Regression problems on massive data sets are ubiquitous in many application domains, including the Internet, earth and space sciences, and aviation. Support vector regression (SVR) is a popular technique for modeling the input-output relations of a set of variables under the added constraint of maximizing the margin, which leads to a regularized model that generalizes well. However, for a data set with m training points, building SVR models is challenging because of the O(m^3) training cost. In this paper we propose ParitoSVR, a parallel iterated optimizer for Support Vector Regression in the primal that can be deployed over a network of machines. Each machine iteratively solves a small (sub-)problem based only on the data observed locally, and these solutions are then combined to form the solution to the global problem. Our proposed method is based on the Alternating Direction Method of Multipliers (ADMM) optimization technique. Unlike many other existing techniques, ParitoSVR provably converges to the results obtained from the centralized algorithm, in which the optimization has access to the entire data set. The experimental results show that the algorithm scales well with respect to both accuracy and time to convergence. We use ParitoSVR to identify flights with anomalous fuel consumption in a large fleet-wide commercial aviation database containing thousands of flights. Along with the algorithmic contributions, this paper also describes the deployment of the ADMM-based SVR method on a multicore architecture, namely the NASA Pleiades supercomputing infrastructure. We have successfully run ParitoSVR on millions of training data points and hundreds of compute nodes.
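For orientation, the local-solve-then-combine scheme described in the abstract can be illustrated with the textbook consensus form of ADMM applied to a linear SVR objective in the primal. This is a generic sketch and not necessarily the exact formulation used in the paper. The symbols are assumptions introduced for illustration: N machines, local data D_i on machine i, local weights w_i, consensus variable z, scaled dual variables u_i, penalty parameter \rho, SVR cost C, and tube width \varepsilon. The bias term is omitted for brevity.

```latex
% Generic consensus ADMM for linear primal SVR (illustrative assumption,
% not taken from the paper). \bar{w} and \bar{u} denote averages over
% the N machines.
\begin{align*}
&\min_{w_1,\dots,w_N,\,z}\;\; \tfrac{1}{2}\|z\|_2^2
  + C \sum_{i=1}^{N} \sum_{(x_j,y_j)\in D_i}
    \max\bigl(0,\; |y_j - w_i^\top x_j| - \varepsilon\bigr)
  \quad \text{s.t.}\quad w_i = z,\;\; i = 1,\dots,N \\[4pt]
% Local step: each machine uses only its own data D_i.
&w_i^{k+1} = \arg\min_{w}\;
  C \sum_{(x_j,y_j)\in D_i} \max\bigl(0,\; |y_j - w^\top x_j| - \varepsilon\bigr)
  + \tfrac{\rho}{2}\,\bigl\|w - z^{k} + u_i^{k}\bigr\|_2^2 \\[4pt]
% Combine step: closed form because the regularizer is quadratic.
&z^{k+1} = \frac{N\rho}{1 + N\rho}\,\bigl(\bar{w}^{k+1} + \bar{u}^{k}\bigr) \\[4pt]
% Dual step: accumulate the disagreement between local and global weights.
&u_i^{k+1} = u_i^{k} + w_i^{k+1} - z^{k+1}
\end{align*}
```

In this form only the averages of the local weights and dual variables need to be exchanged in each iteration; the raw training points never leave the machine that holds them.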
Keywords
distributed optimization, support vector regression, aviation