A 17–95.6 TOPS/W Deep Learning Inference Accelerator with Per-Vector Scaled 4-Bit Quantization for Transformers in 5nm
2022 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits) (2022)
Keywords
DNN inference accelerator, BERT, transformers