Applying CNN on a scientific application accelerator based on dataflow architecture

Xiaochun Ye, Taoran Xiang, Xu Tan, Yujing Feng, Haibin Wu, Meng Wu, Dongrui Fan

CCF Transactions on High Performance Computing (2019)

Abstract

Convolutional neural networks (CNNs) are widely used in applications such as face recognition, intelligent monitoring, image recognition, and text recognition. Because of their high computational complexity, many efficient hardware accelerators have been proposed to exploit a high degree of parallelism for CNNs. However, accelerators implemented on FPGAs and ASICs usually sacrifice generality for higher performance and lower power consumption. Other accelerators, such as GPUs, are general enough but consume more power. Fine-grained dataflow architectures, which break from the conventional von Neumann architecture, show natural advantages in processing scientific applications. Meanwhile, CNN algorithms share many key characteristics with scientific applications, including high parallelism, simple loops, and regular memory access patterns. In this paper, we propose a scheme for implementing and optimizing CNNs on a fine-grained dataflow architecture designed for scientific applications, namely the Scientific Processing Unit (SPU). The experimental results show that, with our scheme, the performance of AlexNet and VGG-19 running on the SPU is on average 2.29× higher than on an NVIDIA Titan Xp, and the energy consumption of our hardware is on average 5.76× lower than that of the Titan Xp.

Keywords

Fine-grained dataflow, Convolutional neural network, Parallel computing, Accelerator
