Dremel: interactive analysis of web-scale datasets

Communications of the ACM (2011)

Abstract
Dremel is a scalable, interactive ad-hoc query system for analysis of read-only nested data. By combining multi-level execution trees and columnar data layout, it is capable of running aggregation queries over trillion-row tables in seconds. The system scales to thousands of CPUs and petabytes of data, and has thousands of users at Google. In this paper, we describe the architecture and implementation of Dremel, and explain how it complements MapReduce-based computing. We present a novel columnar storage representation for nested records and discuss experiments on few-thousand node instances of the system.
Keywords
columnar data layout, interactive ad-hoc query system, read-only nested data, system scale, nested record, novel columnar storage representation, MapReduce-based computing, aggregation query, few-thousand node instance, multi-level execution tree, interactive analysis, web-scale datasets
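
The abstract attributes Dremel's speed to two ideas: a columnar data layout and a multi-level execution tree. The toy Python sketch below illustrates how those two ideas might fit together for a simple COUNT(*) GROUP BY query. It is not Dremel's implementation: the table contents and the names `columns`, `leaf_scan`, `merge`, `run_query`, `fanout`, and `slice_size` are all hypothetical, the data is flat rather than nested, and everything runs in one process instead of across servers.

```python
from collections import Counter

# Hypothetical columnar table: each field is stored as its own list, so a
# query that only needs "country" never touches "url" or "bytes".
columns = {
    "country": ["us", "de", "us", "fr", "de", "us"],
    "url": ["a", "b", "c", "d", "e", "f"],
    "bytes": [10, 20, 30, 40, 50, 60],
}


def leaf_scan(column, start, stop):
    """Leaf level: partial COUNT(*) GROUP BY over one slice of one column."""
    return Counter(column[start:stop])


def merge(partials):
    """Intermediate or root level: combine partial aggregates from children."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts per key
    return total


def run_query(column, fanout=2, slice_size=2):
    """Scan disjoint slices at the leaves, then merge level by level."""
    partials = [
        leaf_scan(column, i, i + slice_size)
        for i in range(0, len(column), slice_size)
    ]
    # Each pass of this loop models one level of the execution tree.
    while len(partials) > 1:
        partials = [
            merge(partials[i:i + fanout])
            for i in range(0, len(partials), fanout)
        ]
    return dict(merge(partials))


result = run_query(columns["country"])
print(result)
```

Because the leaves return small partial aggregates rather than raw rows, each level of the tree only has to combine a handful of counters, which is the property that lets this kind of aggregation spread across many machines.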