CANNON: Communication-Aware Sparse Neural Network Optimization

IEEE Transactions on Emerging Topics in Computing (2023)

Abstract
Sparse deep neural networks (DNNs) have the potential to deliver compelling performance and energy efficiency without significant accuracy loss. However, their benefits can quickly diminish if their training is oblivious to the target hardware. For example, the few remaining critical connections can incur significant overhead if they translate into long-distance communication on the target hardware. Therefore, hardware-aware sparse training is needed to leverage the full potential of sparse DNNs. To this end, we propose a novel and comprehensive communication-aware sparse DNN optimization framework for tile-based in-memory computing (IMC) architectures. The proposed technique, CANNON, first maps the DNN layers onto the tiles of the target architecture. Then, it replaces the fully connected and convolutional layers with communication-aware sparse connections. After that, CANNON optimizes the communication cost with minimal impact on the DNN accuracy. Extensive experimental evaluations with a wide range of DNNs and datasets show up to 3.0× lower communication energy, 3.1× lower communication latency, and 6.8× lower energy-delay product compared to state-of-the-art pruning approaches, with a negligible impact on classification accuracy on IMC-based machine learning accelerators.
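To make the idea of communication-aware sparsity concrete, the following is a minimal sketch, not CANNON's actual algorithm. It assumes a 2-D mesh of IMC tiles, a hypothetical tile_of placement helper (contiguous blocks of neurons per tile, tiles laid out row-major), and a made-up penalty hyperparameter: each weight is scored by its magnitude discounted by the hop distance between the tiles holding its source and destination neurons, and the lowest-scoring weights are pruned first.

import numpy as np

def tile_of(neuron_idx, neurons_per_tile, mesh_cols):
    # Hypothetical placement: neurons are assigned to tiles in contiguous
    # blocks, and tiles are laid out row-major on a 2-D mesh.
    tile = neuron_idx // neurons_per_tile
    return divmod(tile, mesh_cols)  # (row, col) of the tile

def comm_aware_prune(W, sparsity, neurons_per_tile=64, mesh_cols=4, penalty=0.1):
    # Score = |weight| - penalty * hop distance between the tiles that hold
    # the source and destination neurons, so long-distance links are pruned
    # preferentially. Illustrative heuristic only, not CANNON itself.
    out_dim, in_dim = W.shape
    score = np.abs(W).astype(float)
    for i in range(out_dim):
        ri, ci = tile_of(i, neurons_per_tile, mesh_cols)
        for j in range(in_dim):
            rj, cj = tile_of(j, neurons_per_tile, mesh_cols)
            hops = abs(ri - rj) + abs(ci - cj)  # Manhattan distance on the mesh
            score[i, j] -= penalty * hops
    k = int(sparsity * W.size)                  # number of weights to drop
    threshold = np.partition(score.ravel(), k)[k]
    mask = score >= threshold                   # keep the top-scoring weights
    return W * mask, mask

# Toy usage: prune 90% of a 256x256 fully connected layer spread over 4 tiles.
rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256))
W_sparse, mask = comm_aware_prune(W, sparsity=0.9)
print(f"kept fraction: {mask.mean():.3f}")

Under this heuristic, two weights of equal magnitude are not equally cheap: the one whose endpoints land on distant tiles is pruned first, which is the communication-awareness idea in miniature. The actual framework goes further by jointly handling the layer-to-tile mapping and the sparsity pattern, as the abstract describes.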
Keywords
Hardware-aware pruning, communication-aware pruning, mapping, sparse neural networks