PASS: Exploiting Post-Activation Sparsity in Streaming Architectures for CNN Acceleration

CoRR (2023)

Abstract
With the ever-growing popularity of Artificial Intelligence, there is an increasing demand for more performant and efficient underlying hardware. Convolutional Neural Networks (CNNs) are a workload of particular importance, achieving high accuracy in computer vision applications. Inside CNNs, a significant fraction of the post-activation values are zero, resulting in many redundant computations. Recent works have exploited this post-activation sparsity on instruction-based CNN accelerators but not on streaming CNN accelerators, despite the fact that streaming architectures are considered the leading design methodology in terms of performance. In this paper, we highlight the challenges associated with exploiting post-activation sparsity for performance gains in streaming CNN accelerators, and demonstrate our approach to addressing them. Using a set of modern CNN benchmarks, our streaming sparse accelerators achieve 1.41x to 1.93x the efficiency (GOP/s/DSP) of state-of-the-art instruction-based sparse accelerators.
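A minimal sketch (not from the paper, and not the authors' hardware method) of the underlying observation: after a ReLU activation, many feature-map values are zero, so any multiply-accumulate that reads a zero activation is redundant and can be skipped without changing the result. All array shapes and names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical post-convolution feature map, followed by ReLU,
# which zeroes out all negative values.
pre_act = rng.standard_normal((8, 16, 16))   # (channels, height, width)
act = np.maximum(pre_act, 0.0)               # ReLU

sparsity = 1.0 - np.count_nonzero(act) / act.size
print(f"post-activation sparsity: {sparsity:.1%}")   # roughly 50% for random inputs

# Dense 1x1 convolution over the activations: every MAC executes,
# even when the activation operand is zero.
weights = rng.standard_normal((4, 8))        # (out_channels, in_channels)
dense_out = np.einsum("oc,chw->ohw", weights, act)

# Zero-skipping variant: only non-zero activations feed MACs. On hardware,
# skipping these operands is where a sparse datapath saves cycles; here it
# only demonstrates that the skipped work does not affect the output.
sparse_out = np.zeros_like(dense_out)
for c, h, w in zip(*np.nonzero(act)):
    sparse_out[:, h, w] += weights[:, c] * act[c, h, w]

print("results match:", np.allclose(dense_out, sparse_out))
```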
Keywords
streaming architectures, post-activation