MSD: Mixing Signed Digit Representations for Hardware-efficient DNN Acceleration on FPGA with Heterogeneous Resources.

FCCM (2023)

Abstract
By quantizing weights with different precision for different parts of a network, mixed-precision quantization promises to reduce the hardware cost and improve the speed of deep neural network (DNN) accelerators, which typically operate with a fixed quantization scheme. However, the additional control required and the decreased hardware efficiency arising from multi-precision operations have made mixed-precision quantization schemes challenging to deploy in practice. This paper presents MSD, a practical mixed-precision quantization framework that leverages the heterogeneous computing resources on an FPGA to perform bit-serial and bit-parallel operations simultaneously. To quantize DNN weights, MSD combines a custom restricted signed digit (RSD) representation, which uses a limited number of effectual bits, with the conventional 2's complement representation. Depending on the availability of fine-grained and coarse-grained resources, MSD encodes a subset of weights with RSD to allow a highly efficient bit-serial multiply-accumulate implementation using LUT resources. Furthermore, the number of effectual bits used in RSD is optimized so that the bit-serial hardware latency matches that of the bit-parallel operations on the coarse-grained resources, ensuring the highest run-time utilization of all on-chip resources. Experiments show that MSD achieves a 1.36x speedup on the ResNet-18 model over the state of the art and 4.91% higher accuracy on MobileNet-V2.
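To make the idea of a limited number of effectual bits concrete, the following is a minimal Python sketch, not the paper's actual encoder. It assumes that an RSD weight can be modeled as a sum of at most `max_digits` signed powers of two, chosen greedily by rounding the residual to the nearest power of two; the abstract does not specify how MSD selects the digits. The function names `rsd_quantize`, `rsd_value`, and `bit_serial_dot` are hypothetical. The dot product shows why such an encoding suits bit-serial hardware: each nonzero digit costs one shift and one add or subtract, so latency scales with the digit budget.

```python
def rsd_quantize(value: int, max_digits: int) -> list[tuple[int, int]]:
    """Approximate an integer weight with at most `max_digits` signed
    powers of two. Returns a list of (sign, exponent) pairs.

    Assumption: greedy nearest-power-of-two selection; the real MSD
    encoding may differ.
    """
    terms = []
    residual = value
    for _ in range(max_digits):
        if residual == 0:
            break
        sign = 1 if residual > 0 else -1
        mag = abs(residual)
        exp = mag.bit_length() - 1  # floor(log2(mag))
        # Round up when mag is strictly closer to 2^(exp+1) than to 2^exp.
        if mag - (1 << exp) > (1 << (exp + 1)) - mag:
            exp += 1
        terms.append((sign, exp))
        residual -= sign * (1 << exp)
    return terms


def rsd_value(terms: list[tuple[int, int]]) -> int:
    """Decode a list of (sign, exponent) pairs back to an integer."""
    return sum(sign * (1 << exp) for sign, exp in terms)


def bit_serial_dot(activations: list[int], encoded_weights) -> int:
    """Multiply-accumulate using only shifts and adds, one nonzero
    weight digit at a time (a software model of bit-serial MAC)."""
    acc = 0
    for a, terms in zip(activations, encoded_weights):
        for sign, exp in terms:
            acc += sign * (a << exp)
    return acc


if __name__ == "__main__":
    weights = [7, -6, 11]
    encoded = [rsd_quantize(w, max_digits=2) for w in weights]
    for w, t in zip(weights, encoded):
        print(w, "->", t, "=", rsd_value(t))
    print(bit_serial_dot([3, -2, 1], encoded))
```

With a budget of two digits, 7 is represented exactly as 8 - 1, while 11 is approximated as 8 + 2. Raising `max_digits` reduces the quantization error at the cost of more shift-add cycles per weight, which is the trade-off the abstract describes tuning against the bit-parallel latency.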
Keywords
FPGA, Hardware Acceleration, DNN, Signed Digit Representation