A Hybrid Fault-Tolerant Scheduling for Deadline-Constrained Tasks in Cloud Systems

IEEE Transactions on Services Computing (2022)

Abstract
Among fault-tolerant strategies, resubmission and replication are fundamental and widely recognized in distributed computing systems. In recent years, many algorithms based on replication or resubmission have been proposed. However, few of them consider the two techniques together, especially in Cloud systems. In this article, we propose a Hybrid Fault-Tolerant Scheduling Algorithm (HFTSA) for independent tasks with deadlines that integrates both techniques in virtualized Cloud systems. During task scheduling, HFTSA selects a fault-tolerant strategy, either resubmission or replication, for each accepted task based on the characteristics of both the task and the Cloud resources, and then reserves suitable resources. During task execution, HFTSA adjusts the fault-tolerant strategies of some tasks online when necessary and provides an online scheduling scheme for handling faults. Moreover, an elastic resource provisioning mechanism is incorporated into HFTSA to dynamically adjust the provisioned resources and improve resource utilization. Experiments on a real cloud platform and a simulated platform are conducted to verify the effectiveness of HFTSA. The results demonstrate that HFTSA provides an efficient fault-tolerant scheduling strategy for deadline-constrained tasks with high resource utilization and outperforms the algorithms it is compared against.
Keywords
Cloud systems, deadline, fault-tolerant, resubmission, replication
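
The abstract does not state the rule HFTSA uses to choose between resubmission and replication. As a rough illustration of how such a per-task choice could depend on the deadline, the sketch below assumes a simple slack-based rule: if there is enough time before the deadline to run the task, detect a fault, and run it again, resubmission is used; otherwise a replica runs in parallel. The names `Task`, `choose_strategy`, and `detect_delay`, and the rule itself, are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Task:
    deadline: float   # absolute deadline (hypothetical time units)
    exec_time: float  # estimated execution time on the chosen resource


def choose_strategy(task: Task, now: float, detect_delay: float = 0.0) -> str:
    """Pick a fault-tolerant strategy from the remaining slack.

    Assumed rule (not the paper's): resubmission needs room for two
    sequential runs plus fault-detection delay; replication needs room
    for only one run because the copies execute in parallel.
    """
    slack = task.deadline - now
    if slack < task.exec_time:
        return "reject"        # cannot meet the deadline even without faults
    if slack >= 2 * task.exec_time + detect_delay:
        return "resubmission"  # a failed run can be retried in time
    return "replication"       # too tight to retry, so run copies concurrently
```

Under this assumed rule, a task with a loose deadline consumes a second resource only if a fault actually occurs, while a task with a tight deadline pays for a replica up front. This is the general trade-off between the two techniques; the selection in HFTSA also takes the characteristics of the Cloud resources into account.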