Selective Gradient Boosting for Effective Learning to Rank

SIGIR 2018

Abstract
Learning an effective ranking function from a large number of query-document examples is a challenging task. Indeed, training sets where queries are associated with a few relevant documents and a large number of irrelevant ones are required to model real scenarios of Web search production systems, where a query can possibly retrieve thousands of matching documents, but only a few of them are actually relevant. In this paper, we propose Selective Gradient Boosting (SelGB), an algorithm addressing the Learning-to-Rank task by focusing on those irrelevant documents that are most likely to be mis-ranked, thus severely hindering the quality of the learned model. SelGB exploits a novel technique minimizing the mis-ranking risk, i.e., the probability that two randomly drawn instances are ranked incorrectly, within a gradient boosting process that iteratively generates an additive ensemble of decision trees. Specifically, at every iteration and on a per-query basis, SelGB selectively chooses among the training instances a small sample of negative examples enhancing the discriminative power of the learned model. Reproducible and comprehensive experiments conducted on a publicly available dataset show that SelGB exploits the diversity and variety of the selected negative examples to train tree ensembles that outperform models generated by state-of-the-art algorithms, achieving NDCG@10 improvements of up to 3.2%.
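To make the selection step described in the abstract concrete, below is a minimal Python sketch of per-query negative selection within a boosting loop: all positives are kept, and only the highest-scored (most at risk of being mis-ranked) fraction of negatives is retained for the next iteration. All names (`select_examples`, `ensemble_predict`, `fit_next_tree`) and the fraction parameter `p` are illustrative assumptions, not taken from the paper; the actual SelGB implementation builds on a gradient-boosted regression tree learner.

```python
import numpy as np

def select_examples(scores, labels, query_ids, p=0.01):
    """Illustrative per-query selection: keep every relevant document
    and only the top-scored fraction p of the irrelevant ones, i.e.
    the negatives most likely to be ranked above a relevant document
    by the current model."""
    keep = np.zeros(len(labels), dtype=bool)
    for q in np.unique(query_ids):
        idx = np.flatnonzero(query_ids == q)
        pos = idx[labels[idx] > 0]
        neg = idx[labels[idx] == 0]
        keep[pos] = True                          # always keep positives
        if len(neg):
            n_keep = max(1, int(np.ceil(p * len(neg))))
            hardest = neg[np.argsort(-scores[neg])[:n_keep]]
            keep[hardest] = True                  # keep hardest negatives
    return keep

# Schematic boosting loop (helpers are hypothetical): rescore the full
# training set with the current ensemble, reselect negatives per query,
# then fit the next regression tree on the reduced set.
# for t in range(num_trees):
#     scores = ensemble_predict(X)                # hypothetical helper
#     mask = select_examples(scores, y, qids, p=0.01)
#     fit_next_tree(X[mask], y[mask])             # hypothetical helper
```

Reselecting at every iteration matters: as the ensemble improves, different negatives become the hardest ones, so the training sample adapts to the current model rather than being fixed up front.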
Keywords
Learning to Rank, Multiple Additive Regression Trees, Boosting