PerfOrator: Eloquent Performance Models for Resource Optimization

Kaushik Rajan, Dharmesh Kakadia, Carlo Curino, Subru Krishnan

SoCC '16: ACM Symposium on Cloud Computing, Santa Clara, CA, USA, October 2016

Abstract

Query Optimization focuses on finding the best query execution plan, given fixed hardware resources. In BigData settings, both pay-as-you-go clouds and on-prem shared clusters, a complementary challenge emerges, Resource Optimization: find the best hardware resources, given an execution plan. In this world, provisioning is almost instantaneous and time-varying resources can be acquired on a per-query basis. This allows us to optimize allocations for completion time, resource usage, dollar cost, etc. These optimizations have a huge impact on performance and cost, and pivot around a core challenge: faithful resource-to-performance models for arbitrary BigData queries. This task is challenging for users and tools alike due to the lack of good statistics (high-velocity, unstructured data), the frequent use of UDFs, the impact of different hardware types on performance, and a lack of understanding of parallel execution at such a scale.

We address this with PerfOrator, a novel approach to resource-to-performance modeling. PerfOrator employs non-linear regression on profile runs to model arbitrary UDFs, calibration queries to generalize across hardware platforms, and analytical framework models to account for parallelism. The resulting estimates are orders of magnitude more accurate than existing approaches (e.g., Hive's optimizer), and have been successfully employed in two resource optimization scenarios: 1) optimizing the provisioning of clusters in cloud settings, with decisions within 1% of optimal, and 2) reserving a skyline of resources for SLA jobs, with accuracies over 10x better than human experts.
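To make the idea of "non-linear regression on profile runs" concrete, here is a minimal sketch. It is not PerfOrator's implementation: the abstract does not say which model family is used, so this example assumes a simple power law, output = a * input^b, fitted by least squares in log-log space. The function names (`fit_power_law`, `predict`) and the profile data are hypothetical, and the data is synthetic, generated from a known law so the fit can be checked.

```python
import math


def fit_power_law(sizes, values):
    """Fit values ~= a * sizes**b by ordinary least squares in log-log space.

    Assumption: a power law is used here only as one simple non-linear
    model; the source does not specify the model family.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)    # intercept in log space = log(a)
    return a, b


def predict(a, b, size):
    """Extrapolate the fitted model to a (typically much larger) input size."""
    return a * size ** b


# Synthetic "profile runs" on small input samples (hypothetical values):
# a UDF whose output size grows as 3 * n**1.5 with the input size n.
profile_sizes = [1_000, 2_000, 4_000, 8_000]
profile_outputs = [3.0 * s ** 1.5 for s in profile_sizes]

a, b = fit_power_law(profile_sizes, profile_outputs)
full_scale_estimate = predict(a, b, 1_000_000)
```

The point of the sketch is the workflow: run the UDF on a few small samples, fit a non-linear model to the observations, then extrapolate to the full input size. A real system would need noisy measurements, model selection, and error bounds, none of which are shown here.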
