Improving I/O Performance With Adaptive Data Compression For Big Data Applications

2014 IEEE International Parallel & Distributed Processing Symposium Workshops (2014)

Abstract
Increasingly large-scale simulations are generating unprecedented amounts of data. However, the widening gap between computation and I/O capacity on high-end computing machines creates a severe bottleneck for data analysis. As a solution, in-situ analytics processes output data while simulations are running, before the data is placed on disk. Data movement between simulation and analytics, however, introduces overhead for in-situ analytics at scale. This paper addresses the following question: can compression be used to reduce the data movement cost and improve the performance of in-situ analytics for petascale applications? In particular, we explore when, where, and how to apply compression techniques to reduce the data movement cost between simulation and analytics. To identify the best algorithm and placement for compressing data in a given situation, we introduce an adaptive data compression algorithm. The adaptive compression service is implemented and analyzed within an in-situ analytics middleware. Experimental results demonstrate that the compression service increases data transfer bandwidth and improves end-to-end application transfer performance.
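The abstract does not detail the selection logic, but the core idea of adaptive compression can be illustrated with a minimal sketch: choose, per output buffer, the codec that minimizes the estimated compress-plus-transfer time given the available link bandwidth. The code below is a hypothetical illustration, not the paper's algorithm; the candidate codecs (zlib, lzma, or no compression), the sample-based estimation, and the link_bandwidth parameter are assumptions introduced here for clarity.

```python
# Hypothetical sketch of an adaptive compression decision (not the paper's method).
# A small sample of the buffer is compressed with each candidate codec to estimate
# its ratio and throughput; the codec with the lowest estimated end-to-end time
# (compression time + transfer time over the given link) is selected.
import time
import zlib
import lzma

CODECS = {
    "none": lambda data: data,
    "zlib": lambda data: zlib.compress(data, 1),        # fast setting
    "lzma": lambda data: lzma.compress(data, preset=0), # fast setting
}

def estimate_transfer_time(data: bytes, codec: str, link_bandwidth: float,
                           sample_size: int = 1 << 20) -> float:
    """Estimate compress-plus-send time for the whole buffer from a sample."""
    sample = data[:sample_size]
    start = time.perf_counter()
    compressed = CODECS[codec](sample)
    elapsed = max(time.perf_counter() - start, 1e-9)

    throughput = len(sample) / elapsed     # bytes/s of compression on this node
    ratio = len(compressed) / len(sample)  # compressed fraction of the sample

    compress_time = 0.0 if codec == "none" else len(data) / throughput
    send_time = (len(data) * ratio) / link_bandwidth
    return compress_time + send_time

def choose_codec(data: bytes, link_bandwidth: float) -> str:
    """Pick the codec with the lowest estimated end-to-end transfer time."""
    return min(CODECS, key=lambda name: estimate_transfer_time(data, name, link_bandwidth))

if __name__ == "__main__":
    payload = bytes(range(256)) * 4096                # stand-in for simulation output
    print(choose_codec(payload, link_bandwidth=1e9))  # e.g. a 1 GB/s link
```

In this sketch the decision naturally flips toward heavier compression on slow links and toward no compression on fast links, which is the trade-off the abstract's "when, where, how" question refers to.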
Key words
I/O Bottlenecks, In-situ Analytics, Compression, Big Data, High-end Computing