Real-Time Mobile Acceleration of DNNs: From Computer Vision to Medical Applications

ASP-DAC (2021)

Cited by 10 | Viewed 37
Abstract
With the growth of mobile vision applications, there is a growing need to break through the current performance limitations of mobile platforms, especially for computationally intensive applications such as object detection, action recognition, and medical diagnosis. To achieve this goal, we present our unified real-time mobile DNN inference acceleration framework, which seamlessly integrates hardware-friendly, structured model compression with mobile-targeted compiler optimizations. We aim at unprecedented, real-time performance for such large-scale neural network inference on mobile devices. A fine-grained block-based pruning scheme is proposed that is universally applicable to all types of DNN layers, such as convolutional layers with different kernel sizes and fully connected layers. Moreover, it is also successfully extended to 3D convolutions. With the assistance of our compiler optimizations, the fine-grained block-based sparsity is fully exploited to achieve high model accuracy and high hardware acceleration simultaneously. To validate our framework, applications from three representative fields are implemented and demonstrated: object detection, activity detection, and medical diagnosis. All applications achieve real-time inference on an off-the-shelf smartphone, outperforming representative mobile DNN inference acceleration frameworks by up to 6.7x in speed. Demonstrations of these applications can be found at the following link: https://bit.ly/39lWpYu.
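To make the idea of fine-grained block-based pruning concrete, the sketch below (not the authors' code; block shape and sparsity ratio are hypothetical parameters) partitions a weight matrix into small blocks and zeros out the blocks with the smallest L2 norms, producing the kind of regular, hardware-friendly sparsity the abstract describes:

```python
# Illustrative sketch of fine-grained block-based pruning.
# Assumption: the layer's weights are a 2D matrix whose shape tiles
# evenly into (br x bc) blocks; block_shape and sparsity are made up
# for demonstration and are not the paper's actual settings.
import numpy as np

def block_prune(weights, block_shape=(4, 4), sparsity=0.5):
    """Zero out the fraction `sparsity` of blocks with the smallest L2 norm."""
    rows, cols = weights.shape
    br, bc = block_shape
    assert rows % br == 0 and cols % bc == 0, "shape must tile evenly"
    # Rearrange the matrix into a (grid_rows, grid_cols, br, bc) grid of blocks.
    grid = weights.reshape(rows // br, br, cols // bc, bc).transpose(0, 2, 1, 3)
    norms = np.linalg.norm(grid, axis=(2, 3))   # one L2 norm per block
    k = int(norms.size * sparsity)              # number of blocks to prune
    if k > 0:
        thresh = np.sort(norms, axis=None)[k - 1]
        mask = norms > thresh                   # keep only the stronger blocks
        grid = grid * mask[:, :, None, None]
    # Undo the block rearrangement to recover the original layout.
    return grid.transpose(0, 2, 1, 3).reshape(rows, cols)

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 8))
pruned = block_prune(w, block_shape=(4, 4), sparsity=0.5)
```

Because entire blocks are zeroed rather than scattered individual weights, a compiler can skip whole block computations and pack the remaining blocks densely, which is what makes this sparsity pattern amenable to the mobile compiler optimizations the framework applies.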
Keywords
computer vision, real-time, mobile acceleration