Augmented Sketch: Faster and More Accurate Stream Processing.

SIGMOD/PODS'16: International Conference on Management of Data San Francisco California USA June, 2016(2016)

引用 161|浏览71
暂无评分
摘要
Approximated algorithms are often used to estimate the frequency of items on high volume, fast data streams. The most common ones are variations of Count-Min sketch, which use sub-linear space for the count, but can produce errors in the counts of the most frequent items and can misclassify low-frequency items. In this paper, we improve the accuracy of sketch-based algorithms by increasing the frequency estimation accuracy of the most frequent items and reducing the possible misclassification of low-frequency items, while also improving the overall throughput. Our solution, called Augmented Sketch (ASketch), is based on a pre-filtering stage that dynamically identifies and aggregates the most frequent items. Items overflowing the pre-filtering stage are processed using a conventional sketch algorithm, thereby making the solution general and applicable in a wide range of contexts. The pre-filtering stage can be efficiently implemented with SIMD instructions on multi-core machines and can be further parallelized through pipeline parallelism where the filtering stage runs in one core and the sketch algorithm runs in another core.
更多
查看译文
关键词
data streams,sketch,approximated algorithms,data structures,stream summary
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要