Measuring the Search Effectiveness of a Breadth-First Crawl

ADVANCES IN INFORMATION RETRIEVAL, PROCEEDINGS (2009)

Abstract
Previous scalability experiments found that early precision improves as collection size increases. However, that finding rested on the assumption that a collection's documents are all sampled with uniform probability from the same population. We contrast this with a large breadth-first web crawl, an important scenario in real-world web search, where the early documents have quite different characteristics from the later documents. Having observed that NDCG@100 (measured over a set of reference queries) begins to plateau in the initial stages of the crawl, we investigate a number of possible reasons for this behaviour. These include the web pages themselves, the metric used to measure retrieval effectiveness, and the set of relevance judgements used.
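For readers unfamiliar with the metric, the following is a minimal sketch of how NDCG@k averaged over a set of queries might be computed. It is not taken from the paper: the function names, the linear gain, and the log2 rank discount are assumptions (other NDCG variants exist), and the abstract does not say which formulation the authors used.

```python
import math

def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k graded relevance values.

    Assumes a linear gain and a log2(rank + 1) discount (one common variant).
    """
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains[:k], start=1))

def ndcg_at_k(ranked_gains, judged_gains, k=100):
    """NDCG@k for one query.

    ranked_gains: relevance grades of the retrieved documents, in rank order.
    judged_gains: grades of all judged documents for the query, used to
                  build the ideal ordering (hypothetical input layout).
    """
    ideal = dcg_at_k(sorted(judged_gains, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal if ideal > 0 else 0.0

def mean_ndcg(runs, k=100):
    """Average NDCG@k over (ranked_gains, judged_gains) pairs, one per query."""
    return sum(ndcg_at_k(r, j, k) for r, j in runs) / len(runs)
```

In this sketch the ideal ordering is built from the full judgement set rather than from the documents crawled so far, so the score of a partial crawl depends on how many judged relevant documents it already contains. Whether the paper normalises this way is not stated in the abstract.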
Keywords
possible reason, previous scalability experiment, search effectiveness, breadth-first crawl, collection size increase, later document, initial stage, important scenario, different characteristic, large breadth-first web crawl, early precision, early document