GERBIL: MPI+YARN

CCGRID(2015)

引用 16|浏览30
暂无评分
摘要
Emerging big data applications comprise rich multi-faceted workflows with both compute-intensive and data-intensive tasks, and intricate communication patterns. While MapReduce is an effective model for data-intensive tasks, the MPI programming model may be better suited for extracting high-performance for compute-intensive tasks. Researchers have recognized this need to employ specialized models for different phases of a workflow, e.g., performing computations using MPI followed by visualizations using MapReduce. However, extant multi-cluster approaches are inefficient as they entail data movement across clusters and porting across data formats. Consequently, there is a crucial need for disparate programming models to co-exist on the same set of resources. In this paper, we address the above issue by designing GERBIL, a framework for transparently co-hosting unmodified MPI applications alongside MapReduce applications on the same cluster. GERBIL exploits YARN as the model agnostic resource negotiator, and provides an easy-to-use interface to the users. GERBIL bridges the fundamental mismatch between YARN and MPI by designing an MPI-aware resource allocation mechanism. We also support five different optimizations: minimizing job wait time, achieving inter-process locality, achieving desired cluster utilization, minimizing network traffic, and minimizing job execution time, all in a multi-tenant environment. Our evaluation shows that GERBIL enables MPI executions with performance comparable to a native MPI setup, and improve compute-intensive applications performance by up to 133% when compared to the corresponding MapReduce-based versions.
更多
查看译文
关键词
Big Data,application program interfaces,data handling,message passing,parallel processing,resource allocation,GERBIL bridges,MPI applications,MPI programming model,MPI+YARN,MPI-aware resource allocation,MapReduce applications,big data applications,compute-intensive applications,compute-intensive tasks,data-intensive tasks,disparate programming models,high-performance tasks,model agnostic resource negotiator,multicluster approaches,multifaceted workflows,
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要