Towards Efficient SRAM-PIM Architecture Design by Exploiting Unstructured Bit-Level Sparsity
arXiv (2024)
Abstract
Bit-level sparsity in neural network models harbors immense untapped
potential. Eliminating redundant calculations of randomly distributed zero-bits
significantly boosts computational efficiency. Yet, traditional digital
SRAM-PIM architecture, limited by rigid crossbar architecture, struggles to
effectively exploit this unstructured sparsity. To address this challenge, we
propose Dyadic Block PIM (DB-PIM), a groundbreaking algorithm-architecture
co-design framework. First, we propose an algorithm coupled with a distinctive
sparsity pattern, termed a dyadic block (DB), that preserves the random
distribution of non-zero bits to maintain accuracy while restricting the number
of these bits in each weight to improve regularity. Architecturally, we develop
a custom PIM macro that includes dyadic block multiplication units (DBMUs) and
Canonical Signed Digit (CSD)-based adder trees, specifically tailored for
Multiply-Accumulate (MAC) operations. An input pre-processing unit (IPU)
further refines performance and efficiency by capitalizing on block-wise input
sparsity. Results show that our proposed co-design framework achieves a
remarkable speedup of up to 7.69x and energy savings of 83.43
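
To make the bit-level idea concrete, the following is a minimal sketch of Canonical Signed Digit (CSD) encoding together with a cap on the number of non-zero digits per weight. It is an illustration under stated assumptions, not the DB-PIM algorithm itself: the abstract does not specify how the dyadic block pattern is derived, so the helper names `to_csd`, `from_csd`, and `keep_top_nonzero`, and the greedy "keep the most significant non-zero digits" rule, are all hypothetical.

```python
def to_csd(n: int) -> list[int]:
    """Return the CSD (non-adjacent form) digits of n, least significant first.

    Each digit is -1, 0, or +1, and no two adjacent digits are both non-zero.
    """
    digits = []
    while n != 0:
        if n & 1:
            # n mod 4 == 1 -> digit +1; n mod 4 == 3 -> digit -1
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def from_csd(digits: list[int]) -> int:
    """Reconstruct the integer from CSD digits (least significant first)."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def keep_top_nonzero(digits: list[int], k: int) -> list[int]:
    """Hypothetical projection: keep only the k most significant non-zero digits.

    This stands in for "restricting the number of non-zero bits in each
    weight"; the paper's actual selection rule may differ.
    """
    out = [0] * len(digits)
    kept = 0
    for i in range(len(digits) - 1, -1, -1):
        if digits[i] != 0 and kept < k:
            out[i] = digits[i]
            kept += 1
    return out


if __name__ == "__main__":
    w = 93                                  # binary 1011101: five non-zero bits
    csd = to_csd(w)                         # 128 - 32 - 4 + 1: four non-zero digits
    approx = from_csd(keep_top_nonzero(csd, 2))
    print(csd, approx)
```

In this sketch each surviving digit corresponds to one shift-and-add (or shift-and-subtract) of the input, so capping the non-zero digits bounds the work per weight while leaving their positions unconstrained.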