A 17–95.6 TOPS/W Deep Learning Inference Accelerator with Per-Vector Scaled 4-Bit Quantization for Transformers in 5nm

2022 IEEE Symposium on VLSI Technology and Circuits

Keywords
DNN inference accelerator, BERT, transformers
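
The title refers to per-vector scaled 4-bit quantization, in which each small sub-vector of weights or activations gets its own fine-grained scale factor on top of any coarser per-tensor or per-channel scaling. Below is a minimal illustrative sketch of that idea; the vector length of 16, the max-abs calibration, and the function name are assumptions for this example and are not taken from the paper or its accelerator implementation.

```python
import numpy as np

def per_vector_quantize(weights, vector_size=16, num_bits=4):
    """Quantize a 1-D array with one scale factor per small sub-vector.

    Sketch of per-vector scaled quantization: every contiguous sub-vector
    is scaled independently from its own max absolute value, so an outlier
    in one vector does not waste the 4-bit range of the others.
    vector_size and the max-abs calibration are illustrative assumptions.
    """
    qmax = 2 ** (num_bits - 1) - 1              # signed 4-bit range: [-8, 7]
    pad = (-len(weights)) % vector_size
    w = np.pad(weights, (0, pad))               # pad so the array splits evenly
    vectors = w.reshape(-1, vector_size)

    # One scale per vector, chosen so each vector's largest magnitude maps to qmax.
    scales = np.abs(vectors).max(axis=1, keepdims=True) / qmax
    scales = np.where(scales == 0, 1.0, scales)  # avoid division by zero

    q = np.clip(np.round(vectors / scales), -qmax - 1, qmax).astype(np.int8)
    dequant = (q * scales).reshape(-1)[:len(weights)]
    return q, scales, dequant

# Usage: the quantization error stays small even when one vector holds an outlier.
w = np.random.randn(64).astype(np.float32)
w[3] *= 10.0                                     # outlier confined to one sub-vector
q, scales, w_hat = per_vector_quantize(w)
print("max abs error:", np.abs(w - w_hat).max())
```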