Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training

International Conference on Learning Representations (ICLR), 2018.


Abstract:

Large-scale distributed training requires significant communication bandwidth for gradient exchange that limits the scalability of multi-node training, and requires expensive high-bandwidth network infrastructure. The situation gets even worse with distributed training on mobile devices (federated learning), which suffers from higher latency, lower throughput, and intermittent poor connections. In this paper, we find 99.9% of the gradient exchange in distributed SGD is redundant, and propose Deep Gradient Compression (DGC) to greatly reduce the communication bandwidth. To preserve accuracy during compression, DGC employs four methods: momentum correction, local gradient clipping, momentum factor masking, and warm-up training. We have applied Deep Gradient Compression to image classification, speech recognition, and language modeling with multiple datasets including Cifar10, ImageNet, Penn Treebank, and Librispeech Corpus. On these scenarios, Deep Gradient Compression achieves a gradient compression ratio from 270x to 600x without losing accuracy, cutting the gradient size of ResNet-50 from 97MB to 0.35MB, and that of DeepSpeech from 488MB to 0.74MB. Deep Gradient Compression enables large-scale distributed training on inexpensive commodity 1Gbps Ethernet and facilitates distributed training on mobile.
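The core mechanism behind DGC is gradient sparsification with local accumulation: each node transmits only the largest-magnitude gradient entries and keeps the unsent remainder as a local residual that is folded back in on later steps, so no gradient information is permanently dropped. Below is a minimal NumPy sketch of that single step; the function name dgc_sparsify, its interface, and the exact top-k selection are illustrative assumptions, not the paper's reference implementation, which additionally applies momentum correction, local gradient clipping, momentum factor masking, and warm-up training to preserve accuracy.

```python
import numpy as np

def dgc_sparsify(grad, residual, sparsity=0.999):
    """One step of top-k gradient sparsification with local accumulation.

    Returns the sparse tensor to communicate and the updated local
    residual holding the entries that were not sent this step.
    """
    acc = grad + residual                          # fold in unsent gradient from earlier steps
    k = max(1, int(round(acc.size * (1.0 - sparsity))))
    threshold = np.partition(np.abs(acc).ravel(), -k)[-k]  # k-th largest magnitude
    mask = np.abs(acc) >= threshold                # ties may admit a few extra entries
    to_send = np.where(mask, acc, 0.0)             # sparse values actually communicated
    residual = np.where(mask, 0.0, acc)            # remainder stays local for next step
    return to_send, residual

# Toy usage: at 99.9% sparsity, only ~0.1% of entries are sent each step,
# which is where the bandwidth reduction comes from.
rng = np.random.default_rng(0)
grad = rng.normal(size=10_000)
residual = np.zeros_like(grad)
to_send, residual = dgc_sparsify(grad, residual)
print(np.count_nonzero(to_send))  # roughly 10 of 10,000 entries
```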

