A Time-efficient and High-performance FPGA-based Continuous Floating-point Matrix Computing Accelerating Architecture for Control System

2020 International Conference on Information Science, Parallel and Distributed Systems (ISPDS) (2020)

Abstract
Matrix computing is one of the most important classes of linear algebra operations and is widely used in both scientific and engineering applications. There is still considerable room for optimizing the acceleration of continuous matrix computing. In this study, we first present two memory access optimization schemes that significantly reduce the I/O time and the total delay. We then extend the data precision supported by continuous matrix computing from double precision to single- and half-precision floating-point data, which broadens the range of supported data types and improves computing performance. The experiments show that the I/O time is reduced by 40% after coarse-grained parallel optimization. Moreover, the I/O time is almost completely hidden by the calculation time after fine-grained data flow optimization. The accelerator achieves a maximum frequency of 180 MHz with 128 PEs and delivers 184.3 GFLOPS for half-precision floating-point data. Compared with state-of-the-art FPGA-based structures, our design offers better time efficiency and a broader application scope.
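The reported figures can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the function names and the block timings are hypothetical, the per-PE throughput is inferred from the abstract's numbers rather than stated in it, and the overlap model is a generic two-stage (load, compute) pipeline, not necessarily the authors' data flow scheme.

```python
def peak_gflops(num_pes, freq_mhz, flops_per_pe_per_cycle):
    """Peak throughput in GFLOPS for a PE array at a given clock."""
    return num_pes * freq_mhz * 1e6 * flops_per_pe_per_cycle / 1e9


def implied_flops_per_pe_per_cycle(gflops, num_pes, freq_mhz):
    """Back out the per-PE, per-cycle work implied by a reported throughput."""
    return gflops * 1e9 / (num_pes * freq_mhz * 1e6)


def total_time_serial(io_time, compute_time, num_blocks):
    """Each block is loaded, then computed, with no overlap."""
    return num_blocks * (io_time + compute_time)


def total_time_overlapped(io_time, compute_time, num_blocks):
    """Generic double-buffered pipeline (an assumption, not the paper's design):
    the load of block i+1 overlaps the computation of block i."""
    steady_state = (num_blocks - 1) * max(io_time, compute_time)
    return io_time + steady_state + compute_time


if __name__ == "__main__":
    # Figures from the abstract: 128 PEs, 180 MHz, 184.3 GFLOPS (half precision).
    per_pe = implied_flops_per_pe_per_cycle(184.3, 128, 180)
    print(f"implied FLOPs per PE per cycle: {per_pe:.2f}")

    # Hypothetical block timings (arbitrary units), chosen only for illustration.
    print(total_time_serial(4.0, 10.0, 5))
    print(total_time_overlapped(4.0, 10.0, 5))
```

Under these assumptions, the reported 184.3 GFLOPS corresponds to roughly 8 floating-point operations per PE per cycle at half precision. In the pipeline model, once computation takes at least as long as I/O for each block, only the first load remains exposed, which is consistent with the abstract's claim that I/O time is almost completely hidden.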
Keywords
matrix computing,accelerator,time-efficient,high-performance,floating-point,FPGAs