General Purpose Deep Learning Accelerator Based On Bit Interleaving

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2023)

Abstract
Along with the rapid evolution of deep neural networks, ever-increasing model complexity imposes formidable computational intensity on hardware accelerators. In this paper, we propose a novel computing philosophy called "bit interleaving" and the associated pair of accelerators, "Bitlet" and "Bitlet-X", to maximally exploit bit-level sparsity. Unlike existing bit-serial/parallel accelerators, Bitlet leverages the abundant "sparsity parallelism" in the parameters to accelerate inference. Bitlet is versatile, supporting diverse precisions on a single platform, including 32-bit floating point (fp32) and fixed point from 1b to 24b. This versatility makes Bitlet suitable for both efficient inference and training. In addition, by updating the key compute engine of the accelerator, Bitlet-X further improves peak power consumption and efficiency for inference-only scenarios, with competitive accuracy. Empirical studies on 12 domain-specific deep learning applications highlight the following results: (1) up to 81×/21× energy efficiency improvement for training/inference over recent high-performance GPUs; (2) up to 15×/8× higher speedup/efficiency over state-of-the-art fixed-point accelerators; (3) 1.5 mm² area and scalable power consumption from 570 mW (fp32) to 432 mW (16b) and 365 mW (8b) at 28 nm TSMC; (4) 1.3× improvement in peak power efficiency of Bitlet-X over Bitlet; (5) high configurability, justified by ablation and sensitivity studies.
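To make the notion of bit-level sparsity concrete, the following minimal Python sketch (not the paper's Bitlet or bit-interleaving design; all function names are hypothetical) shows how a fixed-point dot product can skip the zero bits of each weight, so the work scales with the number of "essential" (nonzero) bits rather than the full bit width.

```python
# Minimal sketch of exploiting bit-level sparsity in a fixed-point dot product.
# This is an illustration only, not the Bitlet microarchitecture.

def essential_bits(w: int, width: int = 8):
    """Return the positions of the set bits ('essential bits') of a weight."""
    return [b for b in range(width) if (w >> b) & 1]

def dot_product_bit_sparse(weights, activations, width: int = 8):
    """Accumulate shifted activations only for nonzero weight bits;
    work is proportional to the essential-bit count, not width * len(weights)."""
    acc = 0
    for w, a in zip(weights, activations):
        for b in essential_bits(w, width):
            acc += a << b  # one shift-add per nonzero bit
    return acc

# Usage: sparse-bit weights need only 4 shift-adds here instead of 8 * 3.
weights = [0b00000001, 0b00010000, 0b00000101]   # 1, 16, 5
activations = [3, 2, 7]
assert dot_product_bit_sparse(weights, activations) == \
    sum(w * a for w, a in zip(weights, activations))
```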
Keywords
Bit-Level Sparsity, Deep Neural Network, Accelerator