Frequent Elements with Witnesses in Data Streams

International Conference on Management of Data (2021)

Abstract
Detecting frequent elements is among the oldest and most-studied problems in the area of data streams. Given a stream of $m$ data items in $\{1, 2, \dots, n\}$, the objective is to output items that appear at least $d$ times, for some threshold parameter $d$, and provably optimal algorithms are known today. However, in many applications, knowing only the frequent elements themselves is not enough: for example, an Internet router may need to know not only the most frequent destination IP addresses of forwarded packets, but also the timestamps of when these packets appeared or any other metadata that "arrived" with the packets, e.g., their source IP addresses. In this paper, we introduce the witness version of the frequent elements problem: given a desired approximation guarantee $\alpha \ge 1$ and a desired frequency $d \le \Delta$, where $\Delta$ is the frequency of the most frequent item, the objective is to report an item together with at least $d / \alpha$ timestamps of when the item appeared in the stream (or any other metadata that arrived with the items). We give provably optimal algorithms for both the insertion-only and insertion-deletion stream settings: in insertion-only streams, we show that space $\tilde{O}(n + d \cdot n^{1/\alpha})$ is necessary and sufficient for every integral $1 \le \alpha \le \log n$. In insertion-deletion streams, we show that space $\tilde{O}\left(\frac{n \cdot d}{\alpha^2}\right)$ is necessary and sufficient, for every $\alpha \le \sqrt{n}$.
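To make the problem statement concrete, the following Python sketch shows what a valid output looks like. It is a naive offline baseline written for illustration only, not the algorithm from the paper: it stores every timestamp, so its space grows with the stream length rather than meeting the bounds stated above. The function name `report_with_witnesses` and the use of stream positions as timestamps are assumptions made for this example.

```python
import math
from collections import defaultdict


def report_with_witnesses(stream, d, alpha):
    """Naive baseline (illustrative assumption, not the paper's method).

    Returns (item, witnesses) where witnesses holds ceil(d / alpha)
    timestamps at which the item appeared, or None if no item
    occurs that often.
    """
    # Record every position at which each item appears.
    # This uses space linear in the stream length.
    positions = defaultdict(list)
    for t, item in enumerate(stream):
        positions[item].append(t)

    # Number of witnesses the output must carry.
    need = math.ceil(d / alpha)

    # Any item with at least `need` occurrences is a valid answer.
    for item, timestamps in positions.items():
        if len(timestamps) >= need:
            return item, timestamps[:need]
    return None


stream = [3, 1, 3, 2, 3, 1, 3]
print(report_with_witnesses(stream, d=4, alpha=2))
print(report_with_witnesses(stream, d=4, alpha=1))
```

In this toy stream, item 3 appears four times, so $\Delta = 4$ and any $d \le 4$ satisfies the precondition $d \le \Delta$. A larger $\alpha$ relaxes the requirement to fewer witnesses, which is the trade-off the space bounds above quantify.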
Keywords
frequent elements, heavy hitters, data streams, algorithms, lower bounds