Listwise Generative Retrieval Models via a Sequential Learning Process
ACM Transactions on Information Systems (2024)
Abstract
Recently, a novel generative retrieval (GR) paradigm has been proposed, where
a single sequence-to-sequence model is learned to directly generate a list of
relevant document identifiers (docids) given a query. Existing GR models
commonly employ maximum likelihood estimation (MLE) for optimization: this
involves maximizing the likelihood of a single relevant docid given an input
query, with the assumption that the likelihood for each docid is independent of
the other docids in the list. We refer to these models as the pointwise
approach in this paper. While the pointwise approach has been shown to be
effective in the context of GR, it is considered sub-optimal due to its
disregard for the fundamental principle that ranking involves making
predictions about lists. In this paper, we address this limitation by
introducing an alternative listwise approach, which empowers the GR model to
optimize the relevance at the docid list level. Specifically, we view the
generation of a ranked docid list as a sequence learning process: at each step
we learn a subset of parameters that maximizes the corresponding generation
likelihood of the i-th docid given the (preceding) top i-1 docids. To
formalize the sequence learning process, we design a positional conditional
probability for GR. To alleviate the potential impact of beam search on the
generation quality during inference, we perform relevance calibration on the
generation likelihood of model-generated docids according to relevance grades.
We conduct extensive experiments on representative binary and multi-graded
relevance datasets. Our empirical results demonstrate that our method
outperforms state-of-the-art GR baselines in terms of retrieval performance.
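The abstract describes optimizing the generation likelihood of each docid conditioned on the docids ranked before it. As a rough illustration of this kind of listwise objective, below is a minimal sketch of a Plackett-Luce-style list likelihood (as used in ListMLE-type losses), where the i-th item's probability is normalized over the items at position i and later. This is an assumption for illustration only, not the paper's exact positional conditional probability; the function name `listwise_nll` and the use of raw log-likelihood scores are hypothetical.

```python
import math

def listwise_nll(scores):
    """Negative log-likelihood of a ranked list under a
    Plackett-Luce model: P(list) = prod_i exp(s_i) / sum_{j>=i} exp(s_j).

    `scores` are per-docid generation scores (e.g., model
    log-likelihoods), ordered by the target ranking, most
    relevant first. Lower returned loss means the model's
    scores agree better with the target ordering.
    """
    nll = 0.0
    for i in range(len(scores)):
        # log of the softmax denominator over the remaining items
        denom = math.log(sum(math.exp(s) for s in scores[i:]))
        nll += denom - scores[i]
    return nll

# A correctly ordered list (scores decreasing with rank) incurs
# a smaller loss than the same scores in reversed order.
good = listwise_nll([3.0, 2.0, 1.0])
bad = listwise_nll([1.0, 2.0, 3.0])
```

Minimizing such a loss couples the docids in the list: each term's normalizer depends on the scores of the docids ranked after it, in contrast to the pointwise MLE objective, which treats each docid independently.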