Dataflow Processing and Optimization on Grid and Cloud Infrastructures

IEEE Data Eng. Bull. (2009)

Abstract
Complex on-demand data retrieval and processing is a characteristic of several applications and combines the notions of querying & search, information filtering & retrieval, data transformation & analysis, and other data manipulations. Such rich tasks are typically represented by data processing graphs, having arbitrary data operators as nodes and their producer-consumer interactions as edges. Optimizing and executing such graphs on top of distributed architectures is critical for the success of the corresponding applications and presents several algorithmic and systemic challenges. This paper describes a system under development that offers such functionality on top of Ad-hoc Clusters, Grids, or Clouds. Operators may be user defined, so their algebraic and other properties as well as those of the data they produce are specified in associated profiles. Optimization is based on these profiles, must satisfy a variety of objectives and constraints, and takes into account the particular characteristics of the underlying architecture, mapping high-level dataflow semantics to flexible runtime structures. The paper highlights the key components of the system and outlines the major directions of its development.
Keywords
data transformation, data retrieval, distributed architecture, satisfiability, data processing