Fast Heuristics for Near-Optimal Task Allocation in Data Stream Processing over Clusters.
CIKM '14: 2014 ACM Conference on Information and Knowledge Management Shanghai China November, 2014(2014)
摘要
We study provisioning and job reconfiguration techniques for adapting to execution environment changes when processing data streams on cluster-based deployments. By monitoring the performance of an executing job, we identify computation and communication bottlenecks. In such cases we reconfigure the job by reallocating its tasks to minimize the communication cost. Our work targets data-intensive applications where the inter-node transfer latency is significant. We aim to minimize the transfer latency while keeping the nodes below some computational load threshold. We propose a scalable centralized scheme that employs fast allocation heuristics. Our techniques are based on a general group-based job representation that is commonly found in many distributed data stream processing frameworks. Using this representation we devise linear-time task allocation algorithms that improve existing quadratic-time solutions in practical cases. We have implemented and evaluated our proposals using both synthetic and real-world scenarios. Our results show that our algorithms: (a) exhibit significant allocation throughput while producing near-optimal allocations, and (b) significantly improve existing task-level approaches.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络