ARMAN: A Reconfigurable Monolithic 3D Accelerator Architecture for Convolutional Neural Networks
CoRR (2024)
Abstract
The Convolutional Neural Network (CNN) has emerged as a powerful and
versatile tool for artificial intelligence (AI) applications. Conventional
computing architectures face challenges in meeting the demanding processing
requirements of compute-intensive CNN applications, as they suffer from limited
throughput and low utilization. To this end, specialized accelerators have been
developed to speed up CNN computations. However, as we demonstrate in this
paper via extensive design space exploration, different neural network models
have different characteristics, which calls for different accelerator
architectures and configurations to match their computing demand. We show that
a one-size-fits-all fixed architecture does not guarantee an optimal
power/energy/performance trade-off. To overcome this challenge, this paper
proposes ARMAN, a novel reconfigurable systolic-array-based accelerator
architecture based on Monolithic 3D (M3D) technology for CNN inference. The
proposed accelerator offers the flexibility to reconfigure among different
scale-up or scale-out arrangements depending on the neural network structure,
providing the optimal trade-off across power, energy, and performance for
various neural network models. We demonstrate the effectiveness of our approach
through evaluations of multiple benchmarks. The results show that the
proposed accelerator achieves up to 2x, 2.24x, 1.48x, and 2x improvements in
execution cycles, power, energy, and energy-delay product (EDP), respectively,
over the non-configurable architecture.