Cost-effective fixed-point hardware support for RISC-V embedded systems

Journal of Systems Architecture(2022)

引用 5|浏览10
暂无评分
摘要
With the ever-increasing energy-efficiency requirements for the computing platforms at the edge, precision tuning techniques highlight the possibility of improving the efficiency of floating-point computations by selectively lowering the precision of intermediate operations without affecting the accuracy of the final result. Recent trends also demonstrated the possibility of successfully employing fixed-point computations in place of floating-point ones to further optimize the efficiency in the high-performance computing (HPC) domain. However, the use of integer functional units to support the execution of fixed-point operations in embedded platforms can severely degrade the energy-delay product (EDP). This work presents a cost-effective architecture to efficiently support fixed-point computations in embedded systems with two goals. On the one hand, it allows replacing the floating-point computations, with meaningful area and EDP improvements. On the other hand, it can complement the FPU, providing flexibility in selecting the best arithmetic for the target applications depending on their accuracy and performance requirements. Experimental results were collected from a representative set of floating-point-intensive applications executed on six variants of the baseline system-on-chip (SoC). We compared the efficiency of different floating- and fixed-point architectures in terms of accuracy, area, and EDP. Our fixed-point solution demonstrated a 0.651 EDP, normalized with respect to a SoC that featured a binary32 FPU, while achieving a negligible accuracy loss, i.e., 0.0003% on average (0.004% peak), compared to binary32 execution. In contrast, a SoC employing the integer functional units for fixed-point execution reports a 1.796 normalized EDP to achieve the same accuracy loss. Compared to the baseline SoC implementing only the integer units, the proposed architecture shows an area overhead limited to 4%, while the SoC featuring binary32 floating-point hardware support requires 32% more resources.
更多
查看译文
关键词
Fixed-point arithmetic,Efficient computing,Power-performance,Embedded systems,Compilers,Microarchitecture
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要