TICC: Transparent Inter-Column Compression for Column-Oriented Database Systems.

CIKM (2017)

Abstract
In this paper, we present TICC, an automatic data compression component that transparently eliminates data redundancies across columns in column-oriented database systems. We further propose two approaches for integrating inter-column compression into existing database systems: one based on User Defined Functions (UDFs) and one implemented natively. We implement both approaches on top of Hive using the ORC file format, a common data format in column stores, and evaluate TICC on real-world datasets. The experimental results demonstrate that TICC significantly reduces storage overhead and processes a variety of queries over large-scale data with up to 20% performance improvement over the original Hive.
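The abstract does not describe how TICC detects or encodes cross-column redundancy, so the following is only a minimal illustration of the general idea, not TICC's algorithm. It assumes one column is mostly determined by another (a near functional dependency) and stores the dependent column as a value mapping plus a small table of exceptions. All function names and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def compress_dependent_column(source, dependent):
    """Encode `dependent` as a source->value mapping plus per-row exceptions.

    Illustrative only: assumes `dependent` is mostly determined by `source`.
    """
    if len(source) != len(dependent):
        raise ValueError("columns must have the same length")
    # Count which dependent value co-occurs most often with each source value.
    counts = defaultdict(Counter)
    for s, d in zip(source, dependent):
        counts[s][d] += 1
    mapping = {s: c.most_common(1)[0][0] for s, c in counts.items()}
    # Rows that deviate from the mapping are stored explicitly by row index.
    exceptions = {
        i: d
        for i, (s, d) in enumerate(zip(source, dependent))
        if mapping[s] != d
    }
    return mapping, exceptions


def decompress_dependent_column(source, mapping, exceptions):
    """Rebuild the dependent column from the source column and the encoding."""
    return [exceptions.get(i, mapping[s]) for i, s in enumerate(source)]


# Hypothetical sample data: `country` is almost fully determined by `city`.
city = ["Paris", "Lyon", "Berlin", "Paris", "Munich", "Berlin", "Berlin"]
country = ["France", "France", "Germany", "France", "Germany", "Unknown", "Germany"]

mapping, exceptions = compress_dependent_column(city, country)
restored = decompress_dependent_column(city, mapping, exceptions)
```

In this sketch the seven-row `country` column is replaced by a four-entry mapping and one exception, and `restored` equals the original column. The saving grows with the number of rows per distinct source value; a real column store would also have to decide which column pairs are worth encoding this way.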
Keywords
Data compression, cross-column redundancy, column store