G-Store: High-Performance Graph Store For Trillion-Edge Processing

IEEE International Conference on High Performance Computing, Data, and Analytics (2016)

Abstract
High-performance graph processing brings great benefits to a wide range of scientific applications, e.g., biological networks, recommendation systems, and social networks, where graphs have grown to terabytes of data with billions of vertices and trillions of edges. Consequently, storage performance plays a critical role in designing a high-performance computer system for graph analytics. In this paper, we present G-Store, a new graph store that incorporates three techniques to accelerate the I/O and computation of graph algorithms. First, G-Store develops a space-efficient tile format for graph data, which takes advantage of the symmetry present in graphs as well as a new smallest-number-of-bits representation. Second, G-Store utilizes tile-based physical grouping on disks so that multi-core CPUs can achieve high cache and memory performance and fully utilize the throughput from an array of solid-state disks. Third, G-Store employs a novel slide-cache-rewind strategy to pipeline graph I/O and computing. With a modest amount of memory, G-Store utilizes a proactive caching strategy in the system so that all fetched graph data are fully utilized before being evicted from memory. We evaluate G-Store on a number of graphs against two state-of-the-art graph engines and show that G-Store achieves 2 to 8x savings in storage and outperforms both by 2 to 32x. G-Store is able to run different algorithms on trillion-edge graphs within tens of minutes, setting a new milestone in semi-external graph processing systems.
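The abstract names the tile format's two space-saving ideas (graph symmetry and a compact bit representation) without giving details. The sketch below is a minimal illustration of how such ideas could combine. It is not G-Store's actual on-disk format. The 16-bit tile width and the names `build_tiles`, `iter_edges`, and `bytes_per_edge` are assumptions made for this example.

```python
from collections import defaultdict

# Assumed tile width for illustration only: each tile covers
# 2**16 source IDs by 2**16 destination IDs.
TILE_BITS = 16


def build_tiles(edges, tile_bits=TILE_BITS):
    """Group undirected edges into tiles, keeping only one direction.

    Each edge (u, v) is normalized so that u <= v (the reverse edge is
    implied by symmetry), then stored under its tile coordinates as a
    pair of tile-local offsets that each fit in `tile_bits` bits.
    """
    mask = (1 << tile_bits) - 1
    tiles = defaultdict(set)
    for u, v in edges:
        if u > v:
            u, v = v, u
        tile_id = (u >> tile_bits, v >> tile_bits)
        tiles[tile_id].add((u & mask, v & mask))
    return {tile_id: sorted(local) for tile_id, local in tiles.items()}


def iter_edges(tiles, tile_bits=TILE_BITS):
    """Reconstruct global (u, v) pairs with u <= v from the tiled form."""
    for (row, col), local_edges in sorted(tiles.items()):
        for lu, lv in local_edges:
            yield (row << tile_bits) | lu, (col << tile_bits) | lv


def bytes_per_edge(tile_bits=TILE_BITS):
    """Bytes needed for one edge when both endpoints are tile-local offsets."""
    return 2 * ((tile_bits + 7) // 8)


if __name__ == "__main__":
    sample = [(1, 70000), (70000, 1), (5, 3), (131072, 131073)]
    tiled = build_tiles(sample)
    print(tiled)
    print(list(iter_edges(tiled)))
    print(bytes_per_edge())
```

In this toy setting an edge takes 4 bytes instead of the 16 bytes needed for two 64-bit vertex IDs, and each undirected edge is stored once rather than twice.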
Keywords
G-Store, high-performance graph store, trillion-edge processing, high-performance graph processing, scientific applications, storage performance, high-performance computer system, graph analytics, graph algorithms, tile format, bits representation, tile-based physical grouping, multicore CPU, cache performance, memory performance, solid-state disks, slide-cache-rewind strategy, graph I/O, proactive caching, graph processing system