Exploiting Sample Diversity In Distributed Machine Learning Systems

CCGRID '16: Proceedings of the 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (2016)

Abstract
With the increasing scale of machine learning, there is a growing need for distributed systems that can execute machine learning algorithms on large clusters. Currently, most distributed machine learning systems are built on iterative optimization algorithms and the parameter server framework. However, most systems compute on all samples in every iteration, which wastes computing resources when the number of samples is very large. In this paper, we study sample diversity and find that most samples contribute little to model updating during most iterations. Based on these findings, we propose a new iterative optimization algorithm that reduces the computation load by reusing iterative computing results. Experiments demonstrate that, compared to current methods, the proposed algorithm reduces the overall computation load by about 23% without increasing communication.
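The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: skipping samples whose cached contribution from an earlier iteration is small, rather than recomputing them every pass. The function name `sgd_with_sample_reuse`, the loss-based skip criterion, and the `threshold` parameter are all hypothetical illustrations, not the authors' method.

```python
import numpy as np

def sgd_with_sample_reuse(X, y, epochs=10, lr=0.1, threshold=1e-3):
    # Hypothetical sketch, not the paper's algorithm: per-sample SGD for
    # logistic regression that caches each sample's last loss and skips
    # samples whose cached contribution fell below `threshold`.
    n, d = X.shape
    w = np.zeros(d)
    cached_loss = np.full(n, np.inf)  # infinite, so every sample runs once
    for epoch in range(epochs):
        skipped = 0
        for i in range(n):
            if cached_loss[i] < threshold:
                skipped += 1  # reuse: low-contribution sample, no recompute
                continue
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))  # sigmoid prediction
            w -= lr * (p - y[i]) * X[i]          # per-sample gradient step
            # Cache this sample's loss as its "contribution" estimate.
            cached_loss[i] = -(y[i] * np.log(p + 1e-12)
                               + (1 - y[i]) * np.log(1 - p + 1e-12))
        print(f"epoch {epoch}: skipped {skipped}/{n} samples")
    return w

# Toy usage: on linearly separable data, later epochs skip the samples
# the model already fits well, mirroring the observation that most
# samples contribute little to model updating in most iterations.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
sgd_with_sample_reuse(X, y)
```

The paper's reported 23% reduction presumably rests on a more careful reuse criterion integrated with the parameter server; this sketch only shows the shape of the computation-skipping idea on a single machine.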
Keywords
computation load reduction,model updating,parameter server framework,iterative optimization algorithm,machine learning scalability,distributed machine learning systems,sample diversity