A Pipelining Loop Optimization Method for Dataflow Architecture

Xu Tan,Xiao-Chun Ye,Xiao-Wei Shen,Yuan-Chao Xu,Da Wang,Lunkai Zhang,Wen-Ming Li,Dong-Rui Fan,Zhi-Min Tang

J. Comput. Sci. Technol.（2018）

引用 9|浏览103

暂无评分

摘要

With the coming of exascale supercomputing era, power efficiency has become the most important obstacle to build an exascale system. Dataflow architecture has native advantage in achieving high power efficiency for scientific applications. However, the state-of-the-art dataflow architectures fail to exploit high parallelism for loop processing. To address this issue, we propose a pipelining loop optimization method (PLO), which makes iterations in loops flow in the processing element (PE) array of dataflow accelerator. This method consists of two techniques, architecture-assisted hardware iteration and instruction-assisted software iteration. In hardware iteration execution model, an on-chip loop controller is designed to generate loop indexes, reducing the complexity of computing kernel and laying a good foundation for pipelining execution. In software iteration execution model, additional loop instructions are presented to solve the iteration dependency problem. Via these two techniques, the average number of instructions ready to execute per cycle is increased to keep floating-point unit busy. Simulation results show that our proposed method outperforms static and dynamic loop execution model in floating-point efficiency by 2.45x and 1.1x on average, respectively, while the hardware cost of these two techniques is acceptable.

查看译文

关键词

dataflow model, control-flow model, loop optimization, exascale computing, scientific application

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要