WMDRS: Workload-Aware Performance Model Based Multi-Task Dynamic-Quota Real-Time Scheduling for Neural Processing Units

2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS) (2023)

Abstract
To improve the capacity of airborne embedded systems to handle deep learning (DL) applications while reducing overall power consumption, it is necessary to equip them with Neural Processing Units (NPUs). Compared with a cloud system, an airborne embedded system usually has a fixed application set but strict real-time constraints. Unfortunately, the built-in NPU scheduler does not consider application priority and therefore cannot provide sufficient real-time capability for airborne embedded systems. At present, there is little research on multi-task real-time scheduling for NPUs. We therefore propose WMDRS, a multi-task dynamic-quota real-time scheduling approach for NPUs based on a workload-aware performance model. The workload-aware NPU performance model accurately predicts the remaining execution time of a task that is running concurrently with other tasks on the NPU. The multi-task dynamic-quota real-time scheduling algorithm provides approximate preemption by dynamically adjusting the NPU computing resources allocated to active applications. In addition, we implement a prototype NPU scheduler that requires no hardware extension. The proposed NPU performance model and real-time scheduling algorithm are evaluated on realistic application sets. Experimental results demonstrate that WMDRS achieves low prediction error and a high scheduling success ratio.
Keywords
real-time scheduling, NPU performance model, embedded system, preemptive scheduling
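
The abstract describes approximate preemption through dynamic quota adjustment but gives no algorithmic detail. The following is only a minimal illustrative sketch of the general idea, not the WMDRS algorithm itself. It assumes a hypothetical linear model in which a task's remaining time scales inversely with its NPU share; the names `Task`, `remaining_ms`, `time_to_deadline_ms`, and `assign_quotas` are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int               # lower value = more urgent (assumption)
    remaining_ms: float         # predicted remaining time at 100% NPU share
    time_to_deadline_ms: float  # time left until the task's deadline


def assign_quotas(tasks, capacity=1.0):
    """Split NPU capacity so urgent tasks get the share they need first.

    Hypothetical model: running with share q stretches the remaining
    time to remaining_ms / q, so the minimum share that still meets the
    deadline is remaining_ms / time_to_deadline_ms.
    """
    quotas = {}
    free = capacity
    # Serve tasks in priority order; each takes what it needs, capped
    # by whatever capacity is still unallocated.
    for t in sorted(tasks, key=lambda t: t.priority):
        if t.time_to_deadline_ms > 0:
            need = t.remaining_ms / t.time_to_deadline_ms
        else:
            need = capacity  # deadline already passed: ask for everything
        share = min(need, free)
        quotas[t.name] = share
        free -= share
    # Hand any leftover capacity to the most urgent task.
    if tasks and free > 0:
        top = min(tasks, key=lambda t: t.priority)
        quotas[top.name] += free
    return quotas


if __name__ == "__main__":
    tasks = [
        Task("detector", priority=0, remaining_ms=20.0, time_to_deadline_ms=40.0),
        Task("tracker", priority=1, remaining_ms=30.0, time_to_deadline_ms=40.0),
    ]
    print(assign_quotas(tasks))
```

Re-running such an allocation whenever a task arrives or finishes would shrink the share of low-priority tasks in favor of urgent ones, which is one way to approximate preemption without hardware support. A real scheduler would replace the linear assumption with a measured performance model.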