An Improved Dynamic Vertical Partitioning Technique for Semi-Structured Data

Sahel Sharify, Alan Lu, Jin Chen, Arnamoy Bhattacharyya, Ali Hashemi, Nick Koudas, Cristiana Amza

2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) (2019)

Abstract

Semi-structured data such as JSON has become the de facto standard for data exchange on the Web. At the same time, relational support for JSON data poses new challenges because of the large number of attributes, the sparsity of those attributes, and dynamic changes in both workload and dataset, all of which are typical of such data. In this paper, we address these challenges through a lightweight, in-memory relational database engine prototype and a flexible vertical partitioning algorithm that uses simple heuristics to adapt the data layout to the workload on the fly. Our experimental evaluation using the NoBench dataset for JSON data shows that we outperform Argo, a state-of-the-art data model that also maps the JSON data format into relational databases, by a factor of 3. We also outperform Hyrise, a state-of-the-art vertical partitioning algorithm designed for in-memory databases, by 24%. Furthermore, our algorithm achieves around 40% better cache utilization and 35% better TLB utilization. Our experiments also show that our partitioning algorithm adapts to workload changes within a few seconds.
