Faster Dynamic Pruning via Reordering of Documents in Inverted Indexes

PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023(2023)

引用 0|浏览1
暂无评分
摘要
Widely used dynamic pruning algorithms (such as MaxScore, WAND and BMW) keep track of the k-th highest score (i.e., heap threshold) among the documents that are scored so far, to avoid scoring the documents that cannot get into the top-k result list. Obviously, the faster the heap threshold converges to its final value, the larger will be the number of skipped documents and hence, the efficiency gains of the pruning algorithms. In this paper, we tailor approaches that reorder the documents in the inverted index based on their access counts and ranks for previous queries. By storing such frequently retrieved documents at front of the postings lists, we aim to compute the heap threshold earlier during the query processing. Our approach yields substantial speedups (up to 1.33x) for all three dynamic pruning algorithms and outperforms two strong baselines that have been employed for document reordering in the literature.
更多
查看译文
关键词
inverted index,search engines,dynamic pruning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要