Gradient Staleness in Asynchronous Optimization Under Random Communication Delays

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Abstract
Distributed optimization is widely used to solve large-scale optimization problems by parallelizing gradient-based algorithms across multiple computing nodes. In asynchronous optimization, the optimization parameter is updated using stale gradients, i.e., gradients computed with respect to out-of-date parameters. Although large degrees of staleness can slow convergence, little is known about the impact of staleness and its relation to other system parameters. In this work, we study and analyze centralized asynchronous optimization. We show that the process of gradient arrivals at the master node is similar in nature to a renewal process. We derive bounds on the expected staleness and show its connection to other system parameters such as the number of workers, the expected compute time, and communication delays. Our derivations can be used in existing convergence analyses to express convergence rates in terms of other known system parameters. Such an expression provides further insight into which factors impact convergence.
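To make the staleness notion concrete, the following is a minimal simulation sketch (not from the paper) of centralized asynchronous SGD on a toy quadratic objective. The exponential compute-time and communication-delay distributions, the learning rate, and the function name simulate_staleness are illustrative assumptions; staleness here is measured as the number of master updates applied between a worker reading the parameter and its gradient arriving.

```python
import heapq

import numpy as np

# Hypothetical illustration: a master applies (possibly stale) gradients as they
# arrive from workers. Each worker's round consists of a random compute time plus
# a random communication delay; the staleness of a gradient is the number of
# master updates that happened since the worker read the parameter.

def simulate_staleness(num_workers=8, num_updates=5000,
                       mean_compute=1.0, mean_delay=0.5,
                       lr=0.01, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=10)           # optimization parameter held by the master
    version = 0                       # number of updates applied so far

    # Event queue entries: (arrival_time, worker_id, parameter_copy, version_when_read)
    events = []
    for w in range(num_workers):
        compute = rng.exponential(mean_compute)
        delay = rng.exponential(mean_delay)
        heapq.heappush(events, (compute + delay, w, x.copy(), version))

    staleness = []
    while version < num_updates:
        arrival, w, x_read, v_read = heapq.heappop(events)
        grad = x_read                 # gradient of f(x) = 0.5 * ||x||^2 at the stale copy
        staleness.append(version - v_read)
        x -= lr * grad                # apply the stale gradient at the master
        version += 1
        # The worker pulls the current parameter and starts a new round.
        compute = rng.exponential(mean_compute)
        delay = rng.exponential(mean_delay)
        heapq.heappush(events, (arrival + compute + delay, w, x.copy(), version))

    return np.mean(staleness)

if __name__ == "__main__":
    for k in (2, 4, 8, 16):
        print(f"workers={k:2d}  mean staleness ~ {simulate_staleness(num_workers=k):.2f}")
```

Under these assumptions the measured mean staleness grows with the number of workers and with the ratio of communication delay to compute time, which is the kind of dependence the paper's bounds make explicit.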
Keywords
expected compute time, gradient staleness, random communication delays, distributed optimization, large-scale optimization problems, computing nodes, out-of-date parameters, gradient-based algorithms, centralized asynchronous optimization, gradient arrival process, master node, renewal process, convergence analyses