Community Training: Partitioning Schemes in Good Shape for Federated Data Grids

Tobias Scholl, Richard Kuntschke, Angelika Reiser, Alfons Kemper

Bangalore (2007)

Abstract

In federated Data Grids, individual institutions share their data sets within a community to enable collaborative data analysis. Data access must be scalable because, in most e-science communities, data sets not only grow exponentially but also become increasingly popular. If data autonomy is retained, each institution has to ensure efficient access to its own data. Analyzing application-specific data properties (such as data skew) or query characteristics (such as query patterns), and distributing data within Data Grids accordingly, improves throughput for data-intensive applications and enables better load balancing across shared resources. We propose a framework for investigating application-specific index structures as a basis for suitable partitioning schemes. We evaluate two variants of the well-known Quadtree data structure as well as the Zones approach, an index structure from the astrophysics domain, according to several criteria. Our framework improves data access within federated Data Grids and can be combined with well-established Grid methods as well as with more flexible P2P technologies.
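
To illustrate how a Quadtree-style index can turn data skew into a partitioning scheme, here is a minimal sketch. It is not one of the two Quadtree variants evaluated in the paper, which the abstract does not describe. It assumes a simple rule: a two-dimensional region is split into four equal quadrants whenever it holds more than `capacity` points. The names `Region`, `quadtree_partition`, and `make_skewed_points` are hypothetical and used for illustration only.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Region:
    """Axis-aligned rectangle [x0, x1) x [y0, y1) and the points it holds."""
    x0: float
    y0: float
    x1: float
    y1: float
    points: list = field(default_factory=list)


def quadtree_partition(region, capacity, max_depth=16):
    """Return leaf regions; split any region holding more than `capacity` points.

    Assumed split rule (not taken from the paper): cut at the midpoint of
    both axes, stop at `max_depth` to guard against duplicate points.
    """
    if len(region.points) <= capacity or max_depth == 0:
        return [region]
    mx = (region.x0 + region.x1) / 2
    my = (region.y0 + region.y1) / 2
    # Child order: lower-left, lower-right, upper-left, upper-right.
    children = [
        Region(region.x0, region.y0, mx, my),
        Region(mx, region.y0, region.x1, my),
        Region(region.x0, my, mx, region.y1),
        Region(mx, my, region.x1, region.y1),
    ]
    for x, y in region.points:
        idx = (1 if x >= mx else 0) + (2 if y >= my else 0)
        children[idx].points.append((x, y))
    leaves = []
    for child in children:
        leaves.extend(quadtree_partition(child, capacity, max_depth - 1))
    return leaves


def make_skewed_points(n, seed=0):
    """Hypothetical test data: 90% of points in a small hot spot, 10% uniform."""
    rng = random.Random(seed)
    hot = [(rng.random() * 0.1, rng.random() * 0.1) for _ in range(n * 9 // 10)]
    rest = [(rng.random(), rng.random()) for _ in range(n - len(hot))]
    return hot + rest


if __name__ == "__main__":
    pts = make_skewed_points(1000)
    leaves = quadtree_partition(Region(0.0, 0.0, 1.0, 1.0, pts), capacity=50)
    for leaf in leaves:
        print(f"[{leaf.x0:.3f}, {leaf.x1:.3f}) x [{leaf.y0:.3f}, {leaf.y1:.3f}): "
              f"{len(leaf.points)} points")
```

Under these assumptions, the densely populated corner is divided into many small regions while sparse areas stay coarse, so each leaf carries a bounded share of the data. Each leaf could then be assigned to a different node, which is one way a partitioning scheme could balance load under skew.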

Keywords

data skew, good shape, application-specific data property, collaborative data analysis, data access, partitioning schemes, federated data grids, data autonomy, well-known quadtree data structure, efficient access, community training, individual institution, p2p, data analysis, resource allocation, data grid, load balance, resource sharing, data structure, database indexing, distributed databases, groupware, science communication, grid computing, load balancing
