Network-Aware Locality Scheduling for Distributed Data Operators in Data Centers

IEEE Transactions on Parallel and Distributed Systems (2021)

Citations: 49 | Views: 41
Abstract
Large data centers are currently the mainstream infrastructure for big data processing. As one of the most fundamental tasks in these environments, the efficient execution of distributed data operators (e.g., join and aggregation) still challenges current data systems, and one of the key performance issues is network communication time. State-of-the-art methods that address this problem focus either on application-layer data locality optimization to reduce network traffic or on network-layer data flow optimization to increase bandwidth utilization. However, the techniques in the two layers are entirely independent of each other, and performance gains from a joint optimization perspective have not yet been explored. In this article, we propose a novel approach called NEAL (NEtwork-Aware Locality scheduling) to bridge this gap and thereby further reduce communication time for distributed big data operators. We present the detailed design and implementation of NEAL, and our experimental results demonstrate that NEAL always performs better than current approaches for different workloads and network bandwidth configurations.
Keywords
Distributed databases, Bandwidth, Scheduling, Data centers, Optimization, Processor scheduling, Big Data, Data locality, coflow scheduling, distributed operators, data centers, big data, SDN, metaheuristic