PeriodicSketch: Finding Periodic Items in Data Streams

2022 IEEE 38th International Conference on Data Engineering (ICDE), 2022

Abstract
In this paper, we study periodic items in data streams, that is, items that arrive at a fixed interval. Existing work on mining periodic patterns does not suit data stream scenarios. To find periodic items in real time, we propose a novel sketch, PeriodicSketch, which aims to accurately record the top-$K$ periodic items. To the best of our knowledge, this is the first work to find periodic items in data streams. Any interval may occur many times, and we use frequency to denote the number of times an interval occurs. To pick out periodic items with high frequency, we propose a key technique called the Guaranteed Soft Uniform (GSU) replacement strategy. Our theoretical proofs show that when a replacement succeeds, the new item is more likely than not to have a higher frequency than the current smallest frequency, and that GSU ensures the items in the sketch move progressively closer to the true periodic items. Once the sketch holds all the periodic items, its state does not get worse with high probability. We conduct extensive experiments, and the results show that the Average Absolute Error (AAE) of our sketch, using 1/10 of the memory, is around 737 times (up to 2019 times) lower than that of the baseline solution. Finally, we provide a concrete use case, cache prefetching, which shows that PeriodicSketch can significantly improve the cache hit ratio. All related code for PeriodicSketch is open-sourced and available on GitHub [1].
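The definitions in the abstract can be made concrete with a small exact-counting sketch. This is not PeriodicSketch or the GSU strategy, and it is not taken from the authors' code. It is an unbounded-memory illustration that counts how often each (item, interval) pair occurs and reports the top-K pairs. The function names, the (item, timestamp) stream layout, and the AAE formula used here (mean absolute difference between true and estimated frequencies over the reported pairs) are assumptions made for illustration; the paper's own definitions may differ.

```python
from collections import Counter


def exact_periodic_counts(stream):
    """Count (item, interval) frequencies exactly.

    `stream` is an iterable of (item, timestamp) pairs in arrival order
    (an assumed layout). The interval is the gap between consecutive
    arrivals of the same item.
    """
    last_seen = {}      # item -> timestamp of its previous arrival
    counts = Counter()  # (item, interval) -> number of occurrences
    for item, ts in stream:
        if item in last_seen:
            counts[(item, ts - last_seen[item])] += 1
        last_seen[item] = ts
    return counts


def top_k(counts, k):
    """Return the k (item, interval) pairs with the highest frequency."""
    return counts.most_common(k)


def average_absolute_error(true_counts, estimated):
    """Mean |true - estimated| over the reported pairs (assumed definition)."""
    if not estimated:
        return 0.0
    total = sum(abs(true_counts.get(key, 0) - est) for key, est in estimated.items())
    return total / len(estimated)


if __name__ == "__main__":
    # Item "a" arrives every 3 time units; item "b" arrives irregularly.
    demo = [("a", 0), ("b", 1), ("a", 3), ("b", 4), ("a", 6), ("b", 9), ("a", 9)]
    print(top_k(exact_periodic_counts(demo), 1))
```

Exact counting needs memory proportional to the number of distinct (item, interval) pairs, which is the cost a fixed-size sketch is meant to avoid.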
Keywords
data streams, top-K periodic items, PeriodicSketch, GSU replacement strategy, guaranteed soft uniform, average absolute error, AAE, cache prefetch, periodic pattern mining, cache hit ratio