Online Burst Detection Over High Speed Short Text Streams

COMPUTATIONAL SCIENCE - ICCS 2007, PT 3, PROCEEDINGS (2007)

Abstract
Burst detection is an inherent problem for data streams and has attracted extensive attention in the research community because of its broad applications. In this paper, we introduce an integrated approach to the burst event detection problem over high-speed short text streams. First, we simplify the problem by treating a burst event as a set of burst features, which accelerates processing and allows multiple features to be identified simultaneously. Second, we use as the measurement the ratio of the number of documents containing a specific feature to the total number of documents in a period of time, so our solution adapts to any kind of data distribution. We then propose two algorithms to maintain this ratio over a sliding window. Finally, we propose a burst detection algorithm based on the Ratio Aggregation Pyramid (RAP) and Slope Pyramid (SP) data structures, which are extended from the Aggregation Pyramid (AP). Our algorithm detects bursts at multiple window sizes simultaneously and is parameter-free. Theoretical analysis and experimental results verify the effectiveness, efficiency, and scalability of our method.
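The ratio measurement described in the abstract can be illustrated with a minimal Python sketch. It is not one of the paper's two maintenance algorithms, and it does not implement RAP or SP; it only shows the quantity being tracked, under the assumptions that the window is count-based (the last `window` documents), that a feature is a single whitespace-delimited token, and that matching is case-insensitive. The names `SlidingRatio`, `feature`, and `window` are hypothetical.

```python
from collections import deque


class SlidingRatio:
    """Track the share of recent documents that contain one feature.

    Illustrative only: a count-based window over the last `window`
    documents, with a feature modeled as a single lowercase token.
    """

    def __init__(self, feature, window):
        self.feature = feature.lower()
        self.window = window
        self.flags = deque()  # 1 if the document had the feature, else 0
        self.hits = 0         # running sum of the flags in the window

    def add(self, document):
        """Add one document and return the current ratio in the window."""
        flag = 1 if self.feature in document.lower().split() else 0
        self.flags.append(flag)
        self.hits += flag
        # Evict the oldest document once the window is over capacity.
        if len(self.flags) > self.window:
            self.hits -= self.flags.popleft()
        return self.hits / len(self.flags)


if __name__ == "__main__":
    stream = ["goal scored", "nice weather", "goal again", "what a goal"]
    tracker = SlidingRatio("goal", window=2)
    for doc in stream:
        print(doc, tracker.add(doc))
```

Because the ratio is normalized by the number of documents in the window, a rise in overall traffic alone does not change it; only a rise in the feature's share does.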
Keywords
slope pyramid, burst events detection problem, aggregation pyramid, burst detection, online burst detection, high speed, data distribution, short text streams, burst event, ratio aggregation pyramid, burst detection algorithm, burst feature, data stream, sliding window, data structure