A Time-efficient and High-performance FPGA-based Continuous Floating-point Matrix Computing Accelerating Architecture for Control System
2020 International Conference on Information Science, Parallel and Distributed Systems (ISPDS), 2020
Abstract
Matrix computing is one of the most important classes of linear algebra operations and is broadly used in both scientific and engineering applications. There is still considerable room for optimizing the acceleration of continuous matrix computing. In this study, we first present two memory access optimization schemes that significantly reduce the I/O time and the total delay. Then, we extend the data precision supported for continuous matrix computing from double-precision to single-precision and half-precision floating-point data, which broadens the range of supported data types and improves computing performance. The experiments show that the I/O time is reduced by 40% after coarse-grained parallel optimization. Moreover, the I/O time is almost completely hidden by the calculation time after fine-grained data flow optimization. The accelerator achieves a maximum frequency of 180 MHz with 128 PEs and delivers 184.3 GFLOPS for half-precision floating-point data. Compared with state-of-the-art FPGA-based structures, our design offers better time efficiency and a wider application scope.
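The reported figures can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the function names and the simple `max`/sum timing model are assumptions and do not come from the paper. It shows that 184.3 GFLOPS at 180 MHz with 128 PEs corresponds to roughly 8 floating-point operations per PE per cycle (an inference from the numbers, not a figure the abstract states), and what "I/O time hidden by the calculation time" means under an idealized overlap model.

```python
# Figures taken from the abstract.
NUM_PES = 128
FREQ_HZ = 180e6
REPORTED_GFLOPS = 184.3


def implied_flops_per_pe_per_cycle(gflops, num_pes, freq_hz):
    """Operations each PE must complete per clock cycle to reach `gflops`."""
    return gflops * 1e9 / (num_pes * freq_hz)


def total_time(io_time, compute_time, overlapped):
    """Hypothetical timing model (an assumption, not the paper's).

    Without overlap, I/O and computation run back to back.
    With ideal overlap, the shorter phase is hidden behind the longer one.
    """
    if overlapped:
        return max(io_time, compute_time)
    return io_time + compute_time


ratio = implied_flops_per_pe_per_cycle(REPORTED_GFLOPS, NUM_PES, FREQ_HZ)
print(round(ratio, 2))  # 8.0
```

Under this model, once the calculation time exceeds the I/O time, the total delay equals the calculation time alone, which matches the abstract's description of the fine-grained data flow optimization.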
Keywords
matrix computing, accelerator, time-efficient, high-performance, floating-point, FPGAs