Accelerating DCNNs via Cooperative Weight/Activation Compression

Yuhao Zhang, Xikun Jiang, Xinyu Wang, Yudong Pan, Pusen Dong, Bin Sun, Zhaoyan Shen, Zhiping Jia

Lecture Notes in Computer Science (2022)

Abstract
Deep convolutional neural networks (DCNNs) have achieved great success in various applications. Nevertheless, training and deploying such DCNN models require a huge amount of computation and storage resources. Weight pruning has emerged as an effective compression technique for DCNNs to reduce the consumption of hardware resources. However, existing weight pruning schemes fail to consider accuracy, compression ratio, and hardware efficiency simultaneously in the design space, which inevitably leads to performance degradation. To overcome these limitations, we propose a cooperative weight/activation compression approach. First, we observe a property we call spatially insensitive consistency: weight values that have no impact on accuracy tend to occur at the same spatial positions across a large number of channels. Based on this key observation, we propose a cluster-based weight pattern pruning technique, which groups channels exhibiting this consistency into clusters and prunes each cluster to a weight pattern with a uniform shape. In addition, many rows of the activation matrix are sparse due to the Rectified Linear Unit (ReLU) activation function. We study these sparse rows and find that rows with a higher degree of sparsity have a negligible effect on accuracy. Hence, we further propose a sparsity-row-based activation removal technique, which directly eliminates these insensitive rows from the activation matrix. Together, the two techniques allow the hardware to execute at a favorable parallel granularity and achieve a better compression ratio with negligible accuracy loss. Experimental results on several popular DCNN models show that our scheme reduces computation by 63% on average.
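To make the two techniques concrete, below is a minimal, self-contained NumPy sketch of how cluster-based weight pattern pruning and sparsity-row-based activation removal might operate on 3x3 convolution kernels and a post-ReLU activation matrix. The clustering-by-dominant-positions heuristic, the function names, and all thresholds are illustrative assumptions for exposition, not the paper's implementation.

# Illustrative sketch of the two compression steps described in the
# abstract. Function names, thresholds, and the signature-based
# clustering are assumptions, not the authors' actual algorithm.
import numpy as np

def cluster_weight_pattern_prune(weights, keep=4):
    """Group 3x3 kernels by the spatial positions of their largest
    magnitudes, then apply one shared pruning pattern per cluster.

    weights: array of shape (num_kernels, 3, 3)
    keep:    number of spatial positions retained per pattern
    """
    flat = np.abs(weights.reshape(len(weights), -1))   # (N, 9)
    # Rank the 9 spatial positions by magnitude within each kernel.
    ranks = np.argsort(-flat, axis=1)
    signatures = np.sort(ranks[:, :keep], axis=1)      # dominant positions
    # Naive clustering: kernels sharing the same dominant-position
    # signature form a cluster (stand-in for a real clustering step).
    _, labels = np.unique(signatures, axis=0, return_inverse=True)
    pruned = np.zeros_like(weights)
    for c in np.unique(labels):
        members = labels == c
        # One uniform pattern per cluster: keep the positions with the
        # largest mean magnitude across the cluster's kernels.
        mean_mag = flat[members].mean(axis=0)
        mask = np.zeros(9, dtype=bool)
        mask[np.argsort(-mean_mag)[:keep]] = True
        pruned[members] = (weights[members].reshape(-1, 9) * mask).reshape(-1, 3, 3)
    return pruned

def remove_sparse_rows(activations, sparsity_threshold=0.9):
    """Drop activation-matrix rows whose fraction of zeros (after ReLU)
    exceeds the threshold; such rows are reported to be accuracy-insensitive."""
    zero_frac = (activations == 0).mean(axis=1)
    kept = zero_frac <= sparsity_threshold
    return activations[kept], np.nonzero(kept)[0]  # kept rows + original indices

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(8, 3, 3))
    a = np.maximum(rng.normal(size=(16, 32)), 0)   # post-ReLU activations
    pw = cluster_weight_pattern_prune(w)
    ra, idx = remove_sparse_rows(a)
    print("nonzero weights:", np.count_nonzero(pw), "/", w.size)
    print("activation rows kept:", len(idx), "/", len(a))

The per-cluster uniform mask is what underpins the parallel-granularity claim: every kernel in a cluster shares the same nonzero positions, so the hardware can skip the same multiply-accumulate slots across the entire cluster rather than decoding an irregular mask per kernel.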
Keywords
Deep convolutional neural networks (DCNNs), Weight, Activation, Clustering, Compression