An efficient kernel transformation architecture for binary- and ternary-weight neural network inference

2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)

Abstract
While deep convolutional neural networks (CNNs) have emerged as the driving force in a wide range of domains, their computationally and memory-intensive nature hinders their further deployment in mobile and embedded applications. Recently, CNNs with low-precision parameters have attracted much research attention. Among them, multiplier-free binary- and ternary-weight CNNs are reported to achieve recognition accuracy comparable to full-precision networks and have been employed to improve hardware efficiency. However, even with weights constrained to binary and ternary values, large-scale CNNs still require billions of operations in a single forward propagation pass. In this paper, we introduce a novel approach to maximally eliminate redundancy in binary- and ternary-weight CNN inference, improving both performance and energy efficiency. The initial kernels are transformed into far fewer and sparser ones, and the output feature maps are rebuilt from the intermediate results, reducing the total number of operations in convolution. To find an efficient transformation solution for each already-trained network, we propose a searching algorithm that iteratively matches and eliminates the overlap within a set of kernels. We design a dedicated hardware architecture to optimize the implementation of kernel transformation, together with a specialized dataflow and scheduling method. Tested on SVHN, AlexNet, and VGG-16, our architecture removes 43.4%--79.9% of the operations and speeds up inference by 1.48--3.01 times.
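To make the redundancy-elimination idea concrete, the following is a minimal NumPy sketch of the overlap-sharing principle the abstract describes. It is an illustrative assumption, not the paper's actual searching algorithm or hardware dataflow: the helper `overlap` and the decomposition into a `shared` kernel plus residual kernels are hypothetical names chosen here to show how two ternary kernels with common weights can reuse one intermediate partial sum.

```python
import numpy as np

# Sketch (assumed, not the paper's algorithm): two ternary kernels that
# partially overlap are decomposed into one shared kernel, computed once
# as an intermediate result, plus sparser residual kernels.

def overlap(k1, k2):
    """Positions where both kernels hold the same nonzero weight (+1 or -1)."""
    return np.where((k1 == k2) & (k1 != 0), k1, 0)

rng = np.random.default_rng(0)
k1 = rng.integers(-1, 2, size=(3, 3))   # ternary kernel, values in {-1, 0, +1}
k2 = rng.integers(-1, 2, size=(3, 3))

shared = overlap(k1, k2)                # intermediate result, computed once
r1, r2 = k1 - shared, k2 - shared       # sparser residual kernels

# Rebuild the original kernel responses from the intermediate result.
x = rng.random((3, 3))                  # one receptive-field patch
y1 = np.sum(shared * x) + np.sum(r1 * x)
y2 = np.sum(shared * x) + np.sum(r2 * x)
assert np.isclose(y1, np.sum(k1 * x))
assert np.isclose(y2, np.sum(k2 * x))

# Nonzero terms drop from |k1| + |k2| to |k1| + |k2| - |shared|,
# because the shared additions are performed only once.
```

When many kernels share large overlaps, which is what the proposed searching algorithm iteratively looks for, the shared partial sums are reused across kernels and the total number of multiplier-free additions in convolution drops accordingly.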
Keywords
ternary-weight neural network inference,deep convolutional neural networks,memory intensive natures,mobile embedded applications,low-precision parameters,full-precision networks,hardware efficiency,large-scale CNNs,single forward propagation pass,energy efficiency,transformation solution,hardware architecture,multiplier-free ternary-weight CNNs,multiplier-free binary-weight CNNs,SVHN,VGG,kernel transformation architecture,binary-weight neural network inference,searching algorithm