Improving I/O Performance With Adaptive Data Compression For Big Data Applications

2014 IEEE International Parallel & Distributed Processing Symposium Workshops (2014)

Abstract
Increasingly large-scale simulations are generating unprecedented amounts of data. However, the widening gap between computation and I/O capacity on high-end computing machines creates a severe bottleneck for data analysis. As a solution, in-situ analytics processes output data while simulations are running, before the data is placed on disk. Data movement between simulation and analytics, however, introduces overhead for in-situ analytics at scale. This paper addresses the following question: can compression be used to reduce the data movement cost and improve the performance of in-situ analytics for petascale applications? In particular, we explore when, where, and how to apply compression techniques to reduce the data movement cost between simulation and analytics. To identify the best algorithm and placement for compressing data in a given situation, we introduce an adaptive data compression algorithm. The adaptive compression service is implemented and analyzed within an in-situ analytics middleware. Experimental results demonstrate that the compression service increases data transfer bandwidth and improves end-to-end application transfer performance.
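The abstract does not detail the selection logic, but the core idea of adaptive compression can be illustrated with a minimal sketch: choose, per output buffer, the codec that minimizes the estimated compress-plus-transfer time given the available link bandwidth. The code below is a hypothetical illustration, not the paper's algorithm; the candidate codecs (zlib, lzma, or no compression), the sample-based estimation, and the link_bandwidth parameter are assumptions introduced here for clarity.

```python
# Hypothetical sketch of an adaptive compression decision (not the paper's method).
# A small sample of the buffer is compressed with each candidate codec to estimate
# its ratio and throughput; the codec with the lowest estimated end-to-end time
# (compression time + transfer time over the given link) is selected.
import time
import zlib
import lzma

CODECS = {
    "none": lambda data: data,
    "zlib": lambda data: zlib.compress(data, 1),        # fast setting
    "lzma": lambda data: lzma.compress(data, preset=0), # fast setting
}

def estimate_transfer_time(data: bytes, codec: str, link_bandwidth: float,
                           sample_size: int = 1 << 20) -> float:
    """Estimate compress-plus-send time for the whole buffer from a sample."""
    sample = data[:sample_size]
    start = time.perf_counter()
    compressed = CODECS[codec](sample)
    elapsed = max(time.perf_counter() - start, 1e-9)

    throughput = len(sample) / elapsed     # bytes/s of compression on this node
    ratio = len(compressed) / len(sample)  # compressed fraction of the sample

    compress_time = 0.0 if codec == "none" else len(data) / throughput
    send_time = (len(data) * ratio) / link_bandwidth
    return compress_time + send_time

def choose_codec(data: bytes, link_bandwidth: float) -> str:
    """Pick the codec with the lowest estimated end-to-end transfer time."""
    return min(CODECS, key=lambda name: estimate_transfer_time(data, name, link_bandwidth))

if __name__ == "__main__":
    payload = bytes(range(256)) * 4096                # stand-in for simulation output
    print(choose_codec(payload, link_bandwidth=1e9))  # e.g. a 1 GB/s link
```

In this sketch the decision naturally flips toward heavier compression on slow links and toward no compression on fast links, which is the trade-off the abstract's "when, where, how" question refers to.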
Key words
I/O Bottlenecks, In-situ Analytics, Compression, Big Data, High-end Computing