CORE-Sketch: On Exact Computation of Median Absolute Deviation with Limited Space.

Haoquan Guan, Ziling Chen, Shaoxu Song

Proc. VLDB Endow. (2023)

Abstract
Median absolute deviation (MAD), the median of the absolute deviations from the median, has been found useful in various applications such as outlier detection. Together with the median, MAD is more robust to abnormal data than the mean and standard deviation (SD). Unfortunately, existing methods return only an approximate MAD that may be far from the exact one and thus mislead downstream applications. Computing the exact MAD is costly, however, especially in space, because the entire dataset must be stored in memory. In this paper, we propose the COnstruction-REfinement Sketch (CORE-Sketch) for computing the exact MAD. The idea is to construct a sketch within limited space and gradually refine it to find the MAD element, i.e., the element whose distance to the median is exactly equal to the MAD. The mergeability and convergence of the method are analyzed, ensuring the correctness of the proposal and enabling parallel computation. Extensive experiments demonstrate that CORE-Sketch occupies significantly less space than the No-Sketch baseline, which stores the entire dataset, and has time and space costs comparable to those of the DD-Sketch method for approximate MAD.
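To make the definition concrete, the following is a minimal sketch of exact MAD computed by holding every value in memory, which corresponds to the costly full-storage approach the abstract describes. It is not the CORE-Sketch algorithm. The function name `exact_mad` is hypothetical, and the use of Python's `statistics.median` (which averages the two middle values for even-sized inputs) is an assumption that may differ from the paper's convention for the median.

```python
from statistics import median


def exact_mad(values):
    """Exact MAD computed with the whole dataset in memory.

    Illustrative only: this is the full-storage approach, not CORE-Sketch.
    """
    data = list(values)  # O(n) space: every element is kept
    if not data:
        raise ValueError("MAD is undefined for an empty dataset")
    med = median(data)
    # MAD is the median of the absolute deviations from the median.
    return median(abs(x - med) for x in data)


# A single extreme value (100) barely affects the result.
sample = [1, 2, 3, 4, 100]
print(exact_mad(sample))
```

For `sample`, the median is 3 and the absolute deviations are 2, 1, 0, 1, 97, so the MAD is 1 despite the outlier. This illustrates the robustness claim and shows why the space cost grows with the dataset size.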
Keywords
median absolute deviation, exact computation, CORE-Sketch