A 127.8TOPS/W Arbitrarily Quantized 1-to-8b Scalable-Precision Accelerator for General-Purpose Deep Learning with Reduction of Storage, Logic and Latency Waste
ISSCC (2023)
Keywords
arbitrary quantization, compressed format, conventional INT multiplication, data bandwidth, data format, data movement, dynamic-precision bit-serial multiplication, extended-precision AQ computing hardware, general accelerator architecture, inference tasks, layer-by-layer characteristics, layer-by-layer configuration, linear quantization, look-up table, multiply-and-accumulate operations, nonlinear quantization, quantization schemes, run-length coding, scalable-precision general-purpose deep learning accelerators, sparsity-aware accelerator, word length 1 bit to 8 bit, zero elimination scheme
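The keywords name dynamic-precision bit-serial multiplication, the general technique that lets one datapath serve 1-to-8b weights by varying how many bit-planes are processed per layer. The sketch below is a minimal software illustration of that generic idea, not the accelerator's actual circuit; the function name, arguments, and Python form are assumptions for illustration only.

```python
# Illustrative sketch only: generic dynamic-precision bit-serial MAC,
# not this paper's datapath (names and structure are assumptions).

def bit_serial_mac(acts, weights, wbits):
    """Accumulate sum(a * w) by iterating over weight bit-planes serially.

    acts    -- list of activation integers
    weights -- list of unsigned weights quantized to `wbits` bits
    wbits   -- weight precision for this layer (1..8), set layer-by-layer
    """
    acc = 0
    for bit in range(wbits):                 # one pass per weight bit-plane
        plane = 0
        for a, w in zip(acts, weights):
            if (w >> bit) & 1:               # bit-serial: one weight bit at a time,
                plane += a                   # so each partial product is just an add
        acc += plane << bit                  # weight the bit-plane by its significance
    return acc


# Quick check against conventional INT multiplication
acts = [3, 5, 7]
weights = [2, 6, 1]
assert bit_serial_mac(acts, weights, wbits=3) == sum(a * w for a, w in zip(acts, weights))
```

Because latency scales with `wbits`, lower-precision layers finish in fewer cycles, which is the usual motivation for scalable-precision bit-serial hardware.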