Training Deep Nets with Progressive Batch Normalization on Multi-GPUs

International Journal of Parallel Programming (2018)

Abstract
Batch normalization (BN) enables us to train various deep neural networks faster. However, training accuracy degrades significantly as the per-device mini-batch size decreases. To improve model accuracy, a global mean and variance can be computed over the entire input batch, but this requires communication across all devices at every BN layer, which greatly reduces training speed. To address this problem, we propose progressive batch normalization, which achieves a good balance between model accuracy and efficiency in multi-GPU training. Experimental results show that our algorithm obtains a significant improvement over traditional BN without data synchronization across GPUs, achieving up to 18.4% improvement when training DeepLab for the semantic segmentation task on 8 GPUs.
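
The abstract does not spell out the algorithm, but the accuracy/efficiency trade-off it describes hinges on where BN statistics are reduced: locally on each GPU, or across a group of GPUs. Below is a minimal, hypothetical PyTorch sketch of a BN forward pass whose per-channel mean and variance are all-reduced over a configurable process group; the idea that this group might start small and widen during training (hence "progressive") is an assumption drawn from the method's name, not a detail given in the abstract.

    import torch
    import torch.distributed as dist

    def group_bn_forward(x, gamma, beta, group=None, eps=1e-5):
        # Batch-norm forward for an NCHW tensor. Per-channel mean and variance
        # are averaged over the GPUs in `group` (group=None means all ranks;
        # a single-rank group reproduces ordinary per-GPU BN).
        mean = x.mean(dim=(0, 2, 3))              # local E[x] per channel
        sqmean = (x * x).mean(dim=(0, 2, 3))      # local E[x^2] per channel

        if dist.is_available() and dist.is_initialized():
            # Average first and second moments across the group.
            stats = torch.stack([mean, sqmean])
            dist.all_reduce(stats, op=dist.ReduceOp.SUM, group=group)
            stats /= dist.get_world_size(group)
            mean, sqmean = stats[0], stats[1]

        var = sqmean - mean * mean                # Var[x] = E[x^2] - (E[x])^2
        x_hat = (x - mean[None, :, None, None]) / torch.sqrt(var[None, :, None, None] + eps)
        return gamma[None, :, None, None] * x_hat + beta[None, :, None, None]

    # Hypothetical "progressive" schedule (an assumption, not the paper's exact rule):
    # synchronize within small GPU groups early in training and widen the groups
    # later, e.g. groups of 2 ranks, then 4, then all 8, each created with
    # dist.new_group(ranks).

The sketch is only meant to make the trade-off concrete: reducing over a larger group gives statistics closer to full-batch BN (better accuracy at small per-GPU batches) at the cost of more all-reduce traffic per BN layer.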
Keywords
Batch normalization,Data parallelism,Deep learning