A parameter-level parallel optimization algorithm for large-scale spatio-temporal data mining
Distributed and Parallel Databases (2020)
Abstract
The goal of spatio-temporal data mining is to discover previously unknown but useful patterns from spatial and temporal data. However, the explosive growth of spatio-temporal data emphasizes the need for novel, computationally efficient methods for large-scale data mining applications. Since many spatio-temporal data mining problems can be converted into optimization problems, in this paper we propose an efficient parameter-level parallel optimization algorithm for large-scale spatio-temporal data mining. Specifically, most previous optimization methods are based on gradient descent, which iteratively updates the model and provides model-level convergence control for all parameters. That is, they treat all parameters equally and keep updating all of them until every parameter has converged. However, we find that during the iterative process, the convergence rates of model parameters differ from one another. This can cause redundant computation and reduce performance. To solve this problem, we propose a parameter-level parallel stochastic gradient descent (plpSGD), in which the convergence of each parameter is considered independently and only unconverged parameters are updated in each iteration. Moreover, the updating of model parameters is parallelized in plpSGD to further improve the performance of SGD. We have conducted extensive experiments to evaluate the performance of plpSGD. The experimental results show that, compared to previous SGD methods, plpSGD can significantly accelerate the convergence of SGD and achieve excellent scalability with little sacrifice of solution accuracy.
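The abstract's core idea of parameter-level convergence control can be illustrated with a minimal sketch. The code below is not the authors' plpSGD implementation (which is also parallelized); it is a hedged, serial toy in which each weight tracks its own convergence via a running average of its update magnitudes (an assumed criterion) and is frozen once that average falls below a threshold, so only unconverged parameters keep being updated.

```python
import numpy as np

def plp_sgd_sketch(X, y, lr=0.05, tol=1e-6, decay=0.9, max_iter=5000, seed=0):
    """Toy parameter-level SGD for least squares: each weight is
    frozen independently once a moving average of its update size
    drops below `tol`, so later iterations touch fewer parameters."""
    n, d = X.shape
    w = np.zeros(d)
    active = np.ones(d, dtype=bool)       # parameters still being updated
    avg_step = np.ones(d)                 # running average of |update|
    rng = np.random.default_rng(seed)
    for _ in range(max_iter):
        if not active.any():              # every parameter has converged
            break
        i = rng.integers(n)               # sample one training example
        grad = (X[i] @ w - y[i]) * X[i]   # gradient of the squared loss
        step = lr * grad
        w[active] -= step[active]         # update only unconverged params
        avg_step[active] = (decay * avg_step[active]
                            + (1 - decay) * np.abs(step[active]))
        active &= avg_step >= tol         # freeze converged parameters
    return w

# Toy data: y depends only on the first feature, so the second weight
# converges (to ~0) quickly and stops being updated.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 2.0 * X[:, 0]
w = plp_sgd_sketch(X, y)
```

The per-parameter freezing criterion here (moving average of update magnitudes) is one plausible choice; the paper's actual convergence test and its block-based parallelization may differ.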
Keywords
Spatio-temporal data mining, Stochastic gradient descent, Block, Convergence rate, Redundant update