22.1 A 12.4TOPS/W @ 136GOPS AI-IoT System-on-Chip with 16 RISC-V, 2-to-8b Precision-Scalable DNN Acceleration and 30%-Boost Adaptive Body Biasing

2023 IEEE International Solid-State Circuits Conference (ISSCC)

Abstract
Emerging Artificial Intelligence-enabled Internet-of-Things (AI-IoT) SoCs [1-4] for augmented reality, personalized healthcare and nano-robotics need to run a large variety of tasks within a power envelope of a few tens of mW: compute-intensive but bit-precision-tolerant Deep Neural Networks (DNNs), as well as signal processing and control requiring high-precision floating-point. Performance and energy constraints vary greatly between different applications and even within different stages of the same application. We present Marsellus (Fig. 22.1.1), an all-digital AI-IoT end-node heterogeneous SoC fabricated in GlobalFoundries 22nm FDX that combines three key contributions to enable aggressive scaling of performance and energy: 1) a general-purpose cluster of 16 RISC-V DSP cores attuned to a diverse range of workloads, exploiting 4b and 2b arithmetic extensions (XpulpNN) combined with fused MAC & LOAD (M&L) operations and floating-point support; 2) a 2-8b reconfigurable binary engine to accelerate 3x3 and 1x1 (pointwise) convolutions in DNNs; 3) a set of On-Chip Monitoring (OCM) blocks connected to an Adaptive Body Bias (ABB) generator and a hardware control loop, enabling on-the-fly adaptation of transistor threshold voltages.
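To make the low-precision arithmetic concrete, the following is a minimal plain-C sketch of the numerics behind an XpulpNN-style packed dot-product MAC: eight signed 4b activation/weight pairs packed into one 32b word are multiplied and accumulated into a 32b sum in what the extension would execute as a single SIMD instruction. The function names, nibble packing order, and emulation loop are illustrative assumptions, not taken from the paper.

```c
#include <stdint.h>
#include <stdio.h>

/* Sign-extend a 4b value stored in the low nibble of x (fully defined in C). */
static int32_t sext4(uint32_t x)
{
    return (x & 0x8u) ? (int32_t)(x & 0xFu) - 16 : (int32_t)(x & 0xFu);
}

/* Emulated 8-way 4b sum-of-dot-products with accumulation: the numerics an
 * XpulpNN-style SIMD MAC instruction would compute in one cycle (assumed
 * little-nibble-first packing, for illustration only). */
static int32_t sdotp4_emul(uint32_t act_packed, uint32_t wgt_packed, int32_t acc)
{
    for (int i = 0; i < 8; i++) {
        int32_t a = sext4(act_packed >> (4 * i));
        int32_t w = sext4(wgt_packed >> (4 * i));
        acc += a * w;
    }
    return acc;
}

int main(void)
{
    uint32_t act = 0x11111111u;  /* eight activations, each +1            */
    uint32_t wgt = 0xF2F2F2F2u;  /* weights alternate -1 / +2 per nibble  */
    int32_t acc = sdotp4_emul(act, wgt, 0);
    printf("acc = %d\n", acc);   /* 4*(+1*+2) + 4*(+1*-1) = 4             */
    return 0;
}
```

With 8 (4b) or 16 (2b) MACs folded into each 32b operation, and the fused M&L operations overlapping operand loads with the accumulation, per-core throughput scales with the inverse of the operand precision, which is what the 2-to-8b precision-scalable claim in the title refers to.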
Keywords
ABB generator, adaptive body bias generator, AI-IoT end-node heterogeneous SoC, artificial intelligence-enabled Internet-of-Things, augmented reality, bit-precision-tolerant deep neural networks, energy constraints, floating-point support, GlobalFoundries FDX, high-precision floating-point, nanorobotics, OCM, on-chip monitoring, on-chip monitoring blocks, on-the-fly adaptation, personalized healthcare, power envelope, precision-scalable DNN acceleration, reconfigurable binary engine, RISC-V DSP, signal processing, size 22.0 nm, word length 2 bit to 8 bit