Autoscaling for Hadoop Clusters
2016 IEEE International Conference on Cloud Engineering (IC2E), 2016
Abstract
Unforeseen events such as node failures and resource contention can have a severe impact on the performance of data processing frameworks, such as Hadoop, especially in cloud environments where such incidents are common. SLA compliance in the presence of such events requires the ability to quickly and dynamically resize infrastructure resources. Unfortunately, the distributed and stateful nature of data processing frameworks makes it challenging to accurately scale the system at run-time. In this paper, we present the design and implementation of a model-driven autoscaling solution for Hadoop clusters. We first develop novel gray-box performance models for Hadoop workloads that specifically relate job execution times to resource allocation and workload parameters. We then employ these models to dynamically determine the resources required to successfully complete the Hadoop jobs as per the user-specified SLA under various scenarios including node failures and multi-job executions. Our experimental results on three different Hadoop cloud clusters and across different workloads demonstrate the efficacy of our models and highlight their autoscaling capabilities.
Keywords
AutoScaling, Performance Modeling, Hadoop
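
The abstract describes using a performance model that relates job execution time to resource allocation in order to decide how many resources a job needs to meet its SLA. The sketch below illustrates that general idea only. The model form (a fixed serial part plus parallel work divided evenly across nodes), the function names, and all parameter values are assumptions made for illustration; they are not the gray-box models developed in the paper.

```python
import math


def predicted_runtime(nodes, serial_s, parallel_s):
    """Hypothetical model: runtime = serial part + parallel work / nodes.

    This is an illustrative assumption, not the paper's model.
    """
    return serial_s + parallel_s / nodes


def nodes_for_deadline(deadline_s, serial_s, parallel_s, max_nodes):
    """Return the smallest node count whose predicted runtime meets the
    deadline, or None if the deadline cannot be met within max_nodes."""
    if deadline_s <= serial_s:
        # The serial part alone already exceeds the deadline.
        return None
    # Solve serial_s + parallel_s / n <= deadline_s for the smallest integer n.
    needed = max(1, math.ceil(parallel_s / (deadline_s - serial_s)))
    return needed if needed <= max_nodes else None


# Example with made-up numbers: 120 s serial, 3600 node-seconds of parallel
# work, a 600 s deadline, and at most 20 nodes available.
required = nodes_for_deadline(600, 120, 3600, max_nodes=20)
print(required)
```

Under this assumed model, a scaling controller could re-run the same calculation with the remaining work and remaining time whenever a node fails or a new job arrives, and resize the cluster if the answer changes.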