Reducing Load-Use Dependency-Induced Performance Penalty in the Open-Source RISC-V CVA6 CPU.

Euromicro Symposium on Digital Systems Design (2023)

Abstract
Embedded CPUs play a critical role in many modern electronic devices and are used in applications ranging from the IoT edge to automotive and industrial systems. In particular, application-class processors are designed to run operating systems such as Linux, providing a platform for a broad range of software ecosystems. The performance of these processors is therefore critical to ensuring that such systems operate efficiently and reliably. However, performance enhancements should be area- and power-neutral to avoid significant impacts on cost and energy efficiency. In this work, we aim to enhance the performance of CVA6, an open-source application-class RISC-V core. We analyzed CVA6's performance with the Embench-IoT benchmark suite, which revealed that load-use dependencies were a key cause of stalls and thus an opportunity for improvement. To improve load-use dependency handling, we propose an optimization to the micro-architecture of the processor's backend. Specifically, the backend was redesigned by replacing the scoreboard mechanism of CVA6 with a deeper pipeline that includes a second ALU dedicated to executing instructions with load-use dependencies. The new implementation improves IPC by 6.5% on average, with a peak of 29% in Embench-IoT applications that suffer heavily from load-use dependencies. Additionally, the proposed micro-architecture reduces area and power by 2.5% and increases the clock speed by 4%, leading to an overall improvement of 11% in instruction throughput and 6.5% in efficiency.
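The 11% throughput figure is consistent with compounding the two reported gains, since instructions per second equal IPC times clock frequency. The short sketch below checks that arithmetic; it assumes the gains combine multiplicatively, and the function name `combined_throughput_gain` is illustrative rather than taken from the paper. It does not attempt to reproduce the efficiency figure, whose definition the abstract does not give.

```python
def combined_throughput_gain(ipc_gain: float, clock_gain: float) -> float:
    """Relative gain in instructions per second.

    Assumes throughput = IPC * clock frequency, so relative gains
    compound multiplicatively (an assumption, not stated in the abstract).
    """
    return (1.0 + ipc_gain) * (1.0 + clock_gain) - 1.0


# Figures reported in the abstract: +6.5% average IPC, +4% clock speed.
gain = combined_throughput_gain(ipc_gain=0.065, clock_gain=0.04)
print(f"Combined throughput gain: {gain:.1%}")
```

Under this assumption the product is 1.065 × 1.04 = 1.1076, that is, roughly 10.8%, which rounds to the 11% quoted above.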