Ditto: Efficient Serverless Analytics with Elastic Parallelism

Chao Jin, Zili Zhang, Xingyu Xiang, Songyun Zou, Gang Huang, Xuanzhe Liu, Xin Jin

Proceedings of the ACM SIGCOMM 2023 Conference, SIGCOMM 2023 (2023)

Abstract
Serverless computing provides fine-grained resource elasticity for data analytics: a job can flexibly scale its resources for each stage instead of sticking to a fixed pool of resources throughout its lifetime. Because data dependencies differ and shuffling overheads differ between intra- and inter-server communication, the best degree of parallelism (DoP) for each stage varies with runtime conditions. We present Ditto, a job scheduler for serverless analytics that leverages fine-grained resource elasticity to optimize for job completion time (JCT) and cost. The key idea of Ditto is to use a new scheduling granularity, the stage group, to decouple parallelism configuration from function placement. Ditto bundles stages into stage groups based on their data dependencies and IO characteristics. It exploits the parallelized time characteristics of the stages to determine the parallelism configuration, and prioritizes the placement of stage groups with large shuffling traffic, so that the stages in these groups can leverage zero-copy intra-server communication for efficient shuffling. We build a system prototype of Ditto and evaluate it with a variety of benchmarking workloads. Experimental results show that Ditto outperforms existing solutions by up to 2.5x on JCT and up to 1.8x on cost.
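To make the placement idea concrete, the following is a minimal sketch of prioritizing stage groups by shuffle traffic. It is not Ditto's actual algorithm, which the abstract does not specify. The names `StageGroup`, `shuffle_bytes`, `slots`, and `place_groups` are hypothetical, and the sketch assumes that a group benefits from intra-server shuffling only when all of its functions fit on a single server.

```python
from dataclasses import dataclass


@dataclass
class StageGroup:
    name: str
    shuffle_bytes: int  # assumed metric: shuffle traffic inside the group
    slots: int          # assumed: function slots needed at the chosen DoP


def place_groups(groups, free_slots):
    """Greedy sketch: groups with the most shuffle traffic choose first.

    free_slots maps server name -> available function slots.
    Returns group name -> server name, or None when no single server
    can hold the whole group (it would need inter-server shuffling).
    """
    remaining = dict(free_slots)
    placement = {}
    for g in sorted(groups, key=lambda g: g.shuffle_bytes, reverse=True):
        # Best fit: the tightest server that still holds the whole group.
        candidates = [s for s, free in remaining.items() if free >= g.slots]
        if candidates:
            server = min(candidates, key=lambda s: (remaining[s], s))
            remaining[server] -= g.slots
            placement[g.name] = server
        else:
            placement[g.name] = None
    return placement


groups = [
    StageGroup("scan+filter", 2_000, 5),
    StageGroup("join+agg", 90_000, 6),
    StageGroup("sort", 30_000, 4),
]
placement = place_groups(groups, {"server-a": 8, "server-b": 6})
print(placement)
```

In this toy input the shuffle-heavy `join+agg` group is placed first and gets a whole server, while the low-traffic `scan+filter` group is the one left without a single-server placement.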
Keywords
Serverless computing, data analytics, task scheduling