Lynceus: Cost-efficient Tuning and Provisioning of Data Analytic Jobs

CoRR (2020)

Abstract
Modern data analytic and machine learning jobs find in the cloud a natural deployment platform to satisfy their notoriously large resource requirements. Yet, to achieve cost efficiency, it is crucial to identify a deployment configuration that satisfies user-defined QoS constraints (e.g., on execution time) while avoiding unnecessary over-provisioning.

This paper introduces Lynceus, a new approach for the optimization of cloud-based data analytic jobs. It improves over state-of-the-art approaches by enabling significant cost savings, both in the final recommended configuration and in the optimization process used to recommend configurations.

Unlike existing solutions, Lynceus jointly optimizes the cloud-related parameters (i.e., which and how many machines to provision) and the application-level parameters (e.g., the hyper-parameters of a machine learning algorithm). This reduces the cost of recommended configurations by up to 3.7× at the 90th percentile with respect to existing approaches, which treat the optimization of cloud-related and application-level parameters as two independent problems.

Further, Lynceus reduces the cost of the optimization process (i.e., the cloud cost incurred for testing configurations) by up to 11×. This improvement is achieved thanks to two mechanisms: (i) a timeout approach that aborts the exploration of configurations deemed suboptimal, while still extracting useful information to guide future explorations and to improve its predictive model, in contrast to recent works, which either incur the full cost of testing suboptimal configurations or are unable to extract any knowledge from aborted runs; (ii) a long-sighted, budget-aware technique that determines which configurations to test by predicting the long-term impact of each exploration, unlike state-of-the-art approaches for the optimization of cloud jobs, which adopt greedy optimization methods.
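To make the timeout idea concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes a simple cost model (runtime × number of VMs × hourly price) and simulated runtimes. All names (`Config`, `Observation`, `explore`, `best_feasible`) are hypothetical. An aborted run is recorded as a censored observation, so the optimizer still learns a lower bound on the runtime while paying only for the time up to the timeout. The predictive model and the long-sighted, budget-aware selection are not shown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Config:
    """A joint configuration: cloud-related and application-level parameters."""
    vm_type: str
    num_vms: int
    price_per_vm_hour: float  # assumed pricing model, in USD
    batch_size: int           # example application-level parameter


@dataclass
class Observation:
    config: Config
    runtime_hours: float  # observed runtime, or the timeout if censored
    cost: float           # cloud cost actually paid for this test
    censored: bool        # True if the run was aborted at the timeout


def explore(config: Config, true_runtime_hours: float,
            timeout_hours: float) -> Observation:
    """Simulate testing a configuration, aborting it at the timeout."""
    ran = min(true_runtime_hours, timeout_hours)
    cost = ran * config.num_vms * config.price_per_vm_hour
    return Observation(config, ran, cost,
                       censored=true_runtime_hours > timeout_hours)


def best_feasible(observations: List[Observation],
                  max_runtime_hours: float) -> Optional[Observation]:
    """Cheapest completed run that meets the execution-time constraint."""
    feasible = [o for o in observations
                if not o.censored and o.runtime_hours <= max_runtime_hours]
    return min(feasible, key=lambda o: o.cost, default=None)


if __name__ == "__main__":
    small = Config("small", num_vms=4, price_per_vm_hour=0.10, batch_size=64)
    large = Config("large", num_vms=8, price_per_vm_hour=0.50, batch_size=256)
    history = [
        explore(small, true_runtime_hours=10.0, timeout_hours=2.0),  # aborted
        explore(large, true_runtime_hours=1.0, timeout_hours=2.0),   # completes
    ]
    print(best_feasible(history, max_runtime_hours=2.0))
```

In this toy setting the aborted run is billed for 2 hours instead of 10, and its censored record (runtime greater than 2 hours) remains available as a signal for choosing later explorations.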
Keywords
cloud computing, machine learning platforms, optimization, virtual machines, Bayesian optimization