MemPool Meets Systolic: Flexible Systolic Computation in a Large Shared-Memory Processor Cluster

2023 Design, Automation & Test in Europe Conference & Exhibition (DATE 2023)

Abstract
Systolic arrays and shared-memory manycore clusters are two widely used architectural templates that offer vastly different trade-offs. Systolic arrays achieve exceptional performance for workloads with regular dataflow at the cost of a rigid architecture and programming model. Shared-memory manycore systems are more flexible and easy to program, but data must be moved explicitly to/from cores. This work combines the best of both worlds by adding a systolic overlay to a general-purpose shared-memory manycore cluster allowing for efficient systolic execution while maintaining flexibility. We propose and implement two instruction set architecture extensions enabling native and automatic communication between cores through shared memory. Our hybrid approach allows configuring different systolic topologies at execution time and running hybrid systolic-shared-memory computations. The hybrid architecture's convolution kernel outperforms the optimized shared-memory one by 18%.
Keywords
manycore,RISC-V,systolic array