General Purpose Deep Learning Accelerator Based On Bit Interleaving

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2023)

Abstract
Along with the rapid evolution of deep neural networks, ever-increasing model complexity imposes formidable computational intensity on hardware accelerators. In this paper, we propose a novel computing philosophy called "bit interleaving" and the associated pair of accelerators, "Bitlet" and "Bitlet-X", to maximally exploit bit-level sparsity. Unlike existing bit-serial/parallel accelerators, Bitlet leverages the abundant "sparsity parallelism" in the parameters to accelerate inference. Bitlet is versatile, supporting diverse precisions on a single platform, including 32-bit floating point (fp32) and fixed point from 1b to 24b. This versatility makes Bitlet suitable for both efficient inference and training. In addition, by updating the key compute engine of the accelerator, Bitlet-X further improves peak power consumption and efficiency for inference-only scenarios, with competitive accuracy. Empirical studies on 12 domain-specific deep learning applications highlight the following results: (1) up to 81×/21× energy efficiency improvement for training/inference over recent high-performance GPUs; (2) up to 15×/8× higher speedup/efficiency over state-of-the-art fixed-point accelerators; (3) 1.5 mm² area and scalable power consumption from 570 mW (fp32) to 432 mW (16b) and 365 mW (8b) at 28 nm TSMC; (4) 1.3× improvement in peak power efficiency of Bitlet-X over Bitlet; (5) high configurability, justified by ablation and sensitivity studies.
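To make the notion of bit-level sparsity concrete, the following minimal Python sketch (not the paper's Bitlet or bit-interleaving design; all function names are hypothetical) shows how a fixed-point dot product can skip the zero bits of each weight, so the work scales with the number of "essential" (nonzero) bits rather than the full bit width.

```python
# Minimal sketch of exploiting bit-level sparsity in a fixed-point dot product.
# This is an illustration only, not the Bitlet microarchitecture.

def essential_bits(w: int, width: int = 8):
    """Return the positions of the set bits ('essential bits') of a weight."""
    return [b for b in range(width) if (w >> b) & 1]

def dot_product_bit_sparse(weights, activations, width: int = 8):
    """Accumulate shifted activations only for nonzero weight bits;
    work is proportional to the essential-bit count, not width * len(weights)."""
    acc = 0
    for w, a in zip(weights, activations):
        for b in essential_bits(w, width):
            acc += a << b  # one shift-add per nonzero bit
    return acc

# Usage: sparse-bit weights need only 4 shift-adds here instead of 8 * 3.
weights = [0b00000001, 0b00010000, 0b00000101]   # 1, 16, 5
activations = [3, 2, 7]
assert dot_product_bit_sparse(weights, activations) == \
    sum(w * a for w, a in zip(weights, activations))
```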
Keywords
Bit-Level Sparsity, Deep Neural Network, Accelerator