A 384G Output NonZeros/J Graph Convolutional Neural Network Accelerator

IEEE Transactions on Circuits and Systems II: Express Briefs (2022)

Abstract
This brief presents the first IC implementation of a graph convolutional neural network (GCN) accelerator chip. A sparsity-aware dataflow optimized for sub-block-wise processing of the three different matrices in a GCN is proposed to improve the utilization of computing resources while reducing redundant off-chip memory accesses. The accelerator, implemented in 28-nm CMOS, produces 384G nonzero (NZ) outputs/J for the extremely sparse matrix multiplications of the GCN. It achieves 58k-to-143k, 38k-to-92k, and 5k-to-13k Graph/J on the benchmark graph datasets Cora, Citeseer, and Pubmed, respectively. Measured in Graph/J, the energy efficiency of the proposed 16b ASIC implementation is about 4-to-11× and 8-to-25× better than that of the previously reported 8b FPGA and 32b FPGA implementations, respectively.
Keywords
Graph convolutional neural network (GCN), hardware accelerator, machine learning accelerator, sparse matrix multiplication, application-specific integrated circuit (ASIC)
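
To make the headline metric concrete, here is a minimal Python sketch of the sparse products behind one GCN layer and of how nonzero (NZ) outputs per joule could be counted. It assumes the three matrices are the adjacency matrix A, the feature matrix X, and the weight matrix W of the standard GCN formulation. The function names, the dictionary storage format, and the example values are hypothetical, and the sketch does not model the chip's sub-block-wise dataflow.

```python
# Illustrative sketch only: names, storage format, and values are assumptions,
# not the accelerator's actual dataflow or measured data.

def sparse_matmul(a, b):
    """Multiply two sparse matrices stored as {(row, col): value} dicts."""
    # Group the nonzeros of b by row so each nonzero of a only visits
    # the nonzeros of b that it can actually multiply with.
    b_rows = {}
    for (k, j), bv in b.items():
        b_rows.setdefault(k, []).append((j, bv))
    out = {}
    for (i, k), av in a.items():
        for j, bv in b_rows.get(k, []):
            out[(i, j)] = out.get((i, j), 0) + av * bv
    # Drop entries that cancelled to zero, so len(out) counts NZ outputs.
    return {idx: v for idx, v in out.items() if v != 0}


def gcn_aggregate(adj, feat, weight):
    """Compute A @ (X @ W) using only the nonzero entries (no activation)."""
    return sparse_matmul(adj, sparse_matmul(feat, weight))


def nz_outputs_per_joule(nz_count, energy_joules):
    """Energy efficiency expressed as nonzero outputs per joule."""
    return nz_count / energy_joules


# Toy graph: 3 nodes, one edge between nodes 0 and 1, self-loops on all nodes.
adj = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 1, (2, 2): 1}
feat = {(0, 0): 2.0, (1, 1): 3.0, (2, 0): 1.0}   # 3 nodes x 2 features
weight = {(0, 0): 1.0, (1, 0): -1.0}             # 2 features x 1 output

result = gcn_aggregate(adj, feat, weight)
nz_count = len(result)
```

Multiplying X by W first keeps the intermediate result small when W has few output columns; whether the chip uses this ordering is not stated in the abstract.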