An Iterative Scheme for Leverage-based Approximate Aggregation.

2019 IEEE 35th International Conference on Data Engineering (ICDE) (2019)

Abstract
The current data explosion makes it hard to perform approximate aggregation that is both efficient and accurate. To address this problem, we propose a novel approach that computes aggregation answers with high accuracy using only a small portion of the data. We introduce leverages to reflect individual differences in the data from a statistical perspective. Two kinds of estimators, the leverage-based estimator and the sketch estimator (a "rough picture" of the aggregation answer), constrain each other and are iteratively improved according to the actual conditions until their difference falls below a threshold. The iteration mechanism and the leverages together give our approach its high accuracy. Moreover, our approach does not need to record the sampled data and is easy to extend to various execution modes, such as the online mode, which makes it well suited to big data. Experiments show that our approach performs very well: compared with uniform sampling, it achieves high-quality answers with only 1/3 of the sample size.
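The stopping rule described above (refine two estimators until their difference is below a threshold, without storing the sampled rows) can be illustrated with a toy sketch. This is not the paper's method: the abstract does not define the leverages or the sketch estimator, so both estimators here are plain uniform-sample means, and the function name `iterative_mean` and the parameters `batch`, `threshold`, `max_rounds`, and `seed` are hypothetical.

```python
import random

def iterative_mean(data, batch=200, threshold=0.5, max_rounds=1000, seed=0):
    """Toy stopping rule (an assumption, not the paper's estimators):
    grow two independent uniform samples, drawn with replacement, until
    their running means agree within `threshold`.

    Only running sums and a count are kept, so sampled rows are never stored.
    Returns (estimate, total_number_of_draws).
    """
    rng = random.Random(seed)
    sum_a = sum_b = 0.0
    n = 0  # draws made so far for EACH of the two estimators
    est_a = est_b = 0.0
    for _ in range(max_rounds):
        # Draw one more batch for each estimator.
        for _ in range(batch):
            sum_a += rng.choice(data)
            sum_b += rng.choice(data)
        n += batch
        est_a, est_b = sum_a / n, sum_b / n
        # Stop once the two estimates are close enough.
        if abs(est_a - est_b) < threshold:
            break
    # Combine the two estimates; 2 * n values were drawn in total.
    return (est_a + est_b) / 2, 2 * n

if __name__ == "__main__":
    values = [float(i % 100) for i in range(100_000)]  # true mean is 49.5
    estimate, draws = iterative_mean(values)
    print(f"estimate={estimate:.2f} using {draws} draws")
```

In this sketch the two estimators are interchangeable; in the paper they are of different kinds and constrain each other, which the abstract credits for the reduced sample size.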
Keywords
Big Data, Modulation, Computer science, Distributed databases, Estimation, Explosions, Probability