A 127.8TOPS/W Arbitrarily Quantized 1-to-8b Scalable-Precision Accelerator for General-Purpose Deep Learning with Reduction of Storage, Logic and Latency Waste
ISSCC (2023)
Keywords
arbitrary quantization, compressed format, conventional INT multiplication, data bandwidth, data format, data movement, dynamic-precision bit-serial multiplication, extended-precision AQ computing hardware, general accelerator architecture, inference tasks, layer-by-layer characteristics, layer-by-layer configuration, linear quantization, look-up table, multiply-and-accumulate operations, nonlinear quantization, quantization schemes, run-length coding, scalable-precision general-purpose deep learning accelerators, sparsity-aware accelerator, word length 1 bit to 8 bit, zero elimination scheme
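The keywords name dynamic-precision bit-serial multiplication, the general technique that lets one datapath serve 1-to-8b weights by varying how many bit-planes are processed per layer. The sketch below is a minimal software illustration of that generic idea, not the accelerator's actual circuit; the function name, arguments, and Python form are assumptions for illustration only.

```python
# Illustrative sketch only: generic dynamic-precision bit-serial MAC,
# not this paper's datapath (names and structure are assumptions).

def bit_serial_mac(acts, weights, wbits):
    """Accumulate sum(a * w) by iterating over weight bit-planes serially.

    acts    -- list of activation integers
    weights -- list of unsigned weights quantized to `wbits` bits
    wbits   -- weight precision for this layer (1..8), set layer-by-layer
    """
    acc = 0
    for bit in range(wbits):                 # one pass per weight bit-plane
        plane = 0
        for a, w in zip(acts, weights):
            if (w >> bit) & 1:               # bit-serial: one weight bit at a time,
                plane += a                   # so each partial product is just an add
        acc += plane << bit                  # weight the bit-plane by its significance
    return acc


# Quick check against conventional INT multiplication
acts = [3, 5, 7]
weights = [2, 6, 1]
assert bit_serial_mac(acts, weights, wbits=3) == sum(a * w for a, w in zip(acts, weights))
```

Because latency scales with `wbits`, lower-precision layers finish in fewer cycles, which is the usual motivation for scalable-precision bit-serial hardware.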