An SMDP approach for Reinforcement Learning in HPC cluster schedulers

Future Generation Computer Systems (2023)

Abstract
Deep reinforcement learning applied to computing systems has shown potential for improving system performance and for faster discovery of better allocation strategies. In this paper, we map HPC batch job scheduling to the SMDP formalism and present an online, deep reinforcement learning-based solution that uses a modification of the Proximal Policy Optimization algorithm with action masking to minimize job slowdown while supporting large action spaces. In our experiments, we assess the effects of noise in run time estimates on our model, evaluating its behavior in small (64 processors) and large (16384 processors) clusters. We also show the model is robust to changes in workload and cluster size: learned policies transfer across cluster-size changes of up to 10× and from synthetic workload generators to supercomputing workload traces. The proposed model outperforms learning models from the literature as well as classic heuristics, making this a viable modeling approach for robust, transferable, learned scheduling models.
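As a rough illustration of the action-masking technique the abstract mentions, the sketch below shows one common way to exclude invalid scheduling actions in a PPO-style policy: setting their logits to negative infinity before sampling, so masked actions receive zero probability. This is a minimal, assumption-laden sketch, not the authors' implementation; the use of PyTorch, the function name, and the tensor shapes are all hypothetical.

import torch

def masked_policy_distribution(logits: torch.Tensor,
                               mask: torch.Tensor) -> torch.distributions.Categorical:
    """Build a categorical policy over actions, excluding masked-out ones.

    logits: (batch, num_actions) raw scores from the policy network.
    mask:   (batch, num_actions) boolean; True marks a schedulable job slot.
    Masked logits become -inf, so invalid actions get zero probability
    under the softmax and are never sampled.
    """
    masked_logits = logits.masked_fill(~mask, float("-inf"))
    return torch.distributions.Categorical(logits=masked_logits)

# Hypothetical usage: a queue with 4 slots, of which only slots 0 and 2
# currently hold jobs that fit the available processors.
logits = torch.randn(1, 4)
mask = torch.tensor([[True, False, True, False]])
dist = masked_policy_distribution(logits, mask)
action = dist.sample()            # always 0 or 2
log_prob = dist.log_prob(action)  # feeds the PPO clipped objective

In an SMDP formulation, the return used by PPO would additionally discount by the sojourn time between decisions (gamma raised to the elapsed time rather than a fixed per-step gamma), but that detail is orthogonal to the masking shown here.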
Keywords
Deep Reinforcement Learning, Scheduling, Semi-Markov Decision Processes, Workload traces, Machine Learning, Simulation