Self-DC: When to retrieve and When to generate? Self Divide-and-Conquer for Compositional Unknown Questions
CoRR (2024)
Abstract
Retrieve-then-read and generate-then-read are two typical solutions for handling
unknown and known questions in open-domain question answering: the former
retrieves the necessary external knowledge, while the latter prompts the large
language model to generate the knowledge already encoded in its parameters.
However, few previous works consider compositional unknown questions, which
consist of several known or unknown sub-questions. For such questions, simple
binary classification (known or unknown) becomes sub-optimal and inefficient,
because it invokes external retrieval excessively for each compositional unknown
question. To this end, we propose the first Compositional unknown
Question-Answering dataset (CuQA) and introduce a Self Divide-and-Conquer
(Self-DC) framework that empowers LLMs to adaptively call different methods
on demand, resulting in better performance and efficiency. Experimental results
on two datasets (CuQA and FreshQA) demonstrate that Self-DC can achieve
comparable or even better performance with far fewer retrieval calls than
several strong baselines.
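
The adaptive routing idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the choice between methods is made, so the confidence score, the two thresholds (`low`, `high`), the recursion limit, and every function name below are hypothetical. The sketch only shows how a question might be answered from the model's own knowledge, sent to retrieval, or split into sub-questions that are each routed separately, so that retrieval is called only where it is needed.

```python
from typing import Callable, List


def self_dc_sketch(
    question: str,
    confidence: Callable[[str], float],        # hypothetical: model's confidence in [0, 1]
    decompose: Callable[[str], List[str]],     # hypothetical: splits a question into sub-questions
    generate: Callable[[str], str],            # generate-then-read (internal knowledge)
    retrieve: Callable[[str], str],            # retrieve-then-read (external knowledge)
    combine: Callable[[str, List[str]], str],  # merges sub-answers into a final answer
    low: float = 0.3,                          # assumed threshold, not from the source
    high: float = 0.7,                         # assumed threshold, not from the source
    depth: int = 0,
    max_depth: int = 2,
) -> str:
    score = confidence(question)
    if score >= high:
        # Treated as known: answer from the model's own parameters.
        return generate(question)
    if score <= low or depth >= max_depth:
        # Treated as unknown (or recursion limit reached): use retrieval.
        return retrieve(question)
    # Uncertain: divide into sub-questions and route each one separately.
    subs = decompose(question)
    if len(subs) <= 1:
        return retrieve(question)
    sub_answers = [
        self_dc_sketch(s, confidence, decompose, generate, retrieve, combine,
                       low, high, depth + 1, max_depth)
        for s in subs
    ]
    return combine(question, sub_answers)


# Toy stand-ins so the sketch runs without a model or a search engine.
scores = {"Q": 0.5, "A": 0.9, "B": 0.1}
retrieval_calls: List[str] = []


def toy_retrieve(q: str) -> str:
    retrieval_calls.append(q)  # record each external retrieval call
    return f"ret({q})"


result = self_dc_sketch(
    "Q",
    confidence=lambda q: scores[q],
    decompose=lambda q: ["A", "B"] if q == "Q" else [q],
    generate=lambda q: f"gen({q})",
    retrieve=toy_retrieve,
    combine=lambda q, answers: " + ".join(answers),
)
print(result, retrieval_calls)
```

In this toy run the compositional question "Q" is split into a known sub-question "A" and an unknown sub-question "B", so retrieval is invoked once rather than for the whole question.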