GoldFinger: Fast & Approximate Jaccard for Efficient KNN Graph Constructions
IEEE Transactions on Knowledge and Data Engineering (2023)
Abstract
We propose GoldFinger, a new compact and fast-to-compute binary representation of datasets to approximate Jaccard's index. We illustrate the effectiveness of GoldFinger on the emblematic big data problem of K-Nearest-Neighbor (KNN) graph construction and show that GoldFinger can drastically accelerate a large range of existing KNN algorithms with little to no overhead. As a side effect, we also show that the compact representation of the data protects users' privacy for free by providing k-anonymity and l-diversity. Our extensive evaluation of the resulting approach on several realistic datasets shows that it reduces computation times by up to 78.9% compared to raw data while incurring only a negligible to moderate loss in KNN quality. We also show that GoldFinger can be applied to KNN queries (a widely used search technique) and delivers speedups of up to $\times 3.55$ over one of the most efficient approaches to this problem.
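To make the idea of a compact binary representation concrete, here is a minimal Python sketch. It assumes that each set of items is summarized by hashing every item to one position of a fixed-length bit array, and that Jaccard's index is then estimated from the bits the two arrays share. The abstract does not spell out the construction, so the function names (`fingerprint`, `approx_jaccard`), the hash function, and the default length of 1024 bits are illustrative assumptions rather than the authors' implementation.

```python
import hashlib


def fingerprint(items, n_bits=1024):
    """Summarize a set of items as an n_bits-wide bit array stored in an int.

    Assumption: one hash per item, one bit set per item. The hash (BLAKE2b)
    and the default width are illustrative choices only.
    """
    fp = 0
    for item in items:
        digest = hashlib.blake2b(str(item).encode("utf-8"), digest_size=8).digest()
        fp |= 1 << (int.from_bytes(digest, "big") % n_bits)
    return fp


def approx_jaccard(fp_a, fp_b):
    """Estimate Jaccard's index as |bits(a AND b)| / |bits(a OR b)|."""
    union_bits = bin(fp_a | fp_b).count("1")
    if union_bits == 0:
        return 0.0  # both fingerprints are empty
    return bin(fp_a & fp_b).count("1") / union_bits


def exact_jaccard(a, b):
    """Reference Jaccard's index computed on the raw sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    alice = {"item1", "item2", "item3", "item4"}
    bob = {"item3", "item4", "item5"}
    print("exact :", exact_jaccard(alice, bob))
    print("approx:", approx_jaccard(fingerprint(alice), fingerprint(bob)))
```

Under these assumptions, comparing two profiles costs a couple of bitwise operations and bit counts on fixed-size integers, whatever the size of the original sets. Hash collisions mean the estimate can differ from the exact value, and shorter fingerprints collide more often.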
Keywords
KNN graphs, fingerprint, similarity