Improving system latency of AI accelerator with on-chip pipelined activation preprocessing and multi-mode batch inference

2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2021

Abstract
State-of-the-art neural network accelerators exploit massive computing parallelism to achieve high throughput. However, significant latency is observed in master-slave-based AI acceleration systems, which limits their adoption in real-time applications. Investigation of a de-facto GPU system reveals tremendous timing overhead for preprocessing of input activations, which is commonly executed on th...
Keywords
Power demand,Pipelines,Data preprocessing,Random access memory,Prototypes,Throughput,Real-time systems