Reducing Chunk Fragmentation for In-Line Delta Compressed and Deduplicated Backup Systems

2017 International Conference on Networking, Architecture, and Storage (NAS)(2017)

Abstract
Chunk-level deduplication, while robust in removing duplicate chunks, introduces chunk fragmentation, which decreases restore performance. Rewriting algorithms have been proposed to reduce chunk fragmentation and accelerate restores. Delta compression removes redundant data between non-duplicate but similar chunks, which chunk-level deduplication cannot eliminate. Some applications use delta compression as a complement to chunk-level deduplication to attain extra space and bandwidth savings. However, we observe that delta compression introduces a new type of chunk fragmentation stemming from delta compressed chunks whose base chunks are fragmented. We refer to such delta compressed chunks as base-fragmented chunks. We found that this new type of chunk fragmentation has a more severe impact on restore performance than the chunk fragmentation introduced by chunk-level deduplication, and that it cannot be reduced by existing rewriting algorithms. To address the problem caused by base-fragmented chunks, we propose SDC, a scheme that selectively performs delta compression after chunk-level deduplication. The main idea behind SDC is to simulate a restore cache to identify the non-base-fragmented chunks and to perform delta compression only for these chunks, thus avoiding the new type of chunk fragmentation. Due to the locality among backup streams, most of the non-base-fragmented chunks can be detected by the simulated restore cache. Experimental results based on real-world datasets show that SDC improves the restore performance of the delta compressed and deduplicated backup system by 1.93X-7.48X and achieves 95.5%-97.4% of that system's compression, while imposing negligible impact on backup throughput.
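The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the cache policy or granularity, so the sketch assumes an LRU cache of container IDs, and the names `SimulatedRestoreCache`, `plan_backup`, and the `(kind, container_id)` stream encoding are hypothetical. It also ignores the containers written by the current backup itself.

```python
from collections import OrderedDict


class SimulatedRestoreCache:
    """LRU set of container IDs (assumed policy; the abstract names none)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._lru = OrderedDict()

    def touch(self, container_id):
        # Mark a container as most recently used, evicting the oldest if full.
        if container_id in self._lru:
            self._lru.move_to_end(container_id)
        else:
            self._lru[container_id] = True
            if len(self._lru) > self.capacity:
                self._lru.popitem(last=False)

    def __contains__(self, container_id):
        return container_id in self._lru


def plan_backup(stream, capacity):
    """Decide per chunk whether to deduplicate, delta compress, or store.

    stream is a list of (kind, container_id) pairs (hypothetical encoding):
      "dup"     -> container_id holds the identical stored chunk
      "similar" -> container_id holds the candidate base chunk
      "new"     -> container_id is None
    """
    cache = SimulatedRestoreCache(capacity)
    decisions = []
    for kind, container_id in stream:
        if kind == "dup":
            # A restore would read this container, so the simulation caches it.
            cache.touch(container_id)
            decisions.append("dedup")
        elif kind == "similar" and container_id in cache:
            # Base chunk would already be cached at restore time:
            # the chunk is not base-fragmented, so delta compress it.
            cache.touch(container_id)
            decisions.append("delta")
        else:
            # New chunk, or base-fragmented candidate: store without delta.
            decisions.append("store")
    return decisions
```

With a capacity of two containers, a similar chunk whose base sits in a recently referenced container is delta compressed, while one whose base container has been evicted or was never referenced is stored whole, trading a little compression for a restore that needs no extra container reads.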
Keywords
simulated restore cache, base-fragmented chunks, data redundancy, delta compression, rewriting algorithms, chunk-level deduplication, deduplicated backup systems, in-line delta compressed systems, chunk fragmentation