Hypothesis Generation with Large Language Models
arXiv (2024)

Abstract
Effective generation of novel hypotheses is instrumental to scientific
progress. So far, researchers have been the main powerhouse behind hypothesis
generation by painstaking data analysis and thinking (also known as the Eureka
moment). In this paper, we examine the potential of large language models
(LLMs) to generate hypotheses. We focus on hypothesis generation based on data
(i.e., labeled examples). To enable LLMs to handle arbitrarily long contexts,
we generate initial hypotheses from a small number of examples and then update
them iteratively to improve the quality of hypotheses. Inspired by multi-armed
bandits, we design a reward function to inform the exploitation-exploration
tradeoff in the update process. Our algorithm is able to generate hypotheses
that enable much better predictive performance than few-shot prompting in
classification tasks, improving accuracy by 31.7% on a synthetic dataset and
by 13.9% on real-world datasets, and also outperforming supervised learning
by 12.8% on challenging real-world datasets.
Furthermore, we find that the generated hypotheses not only corroborate
human-verified theories but also uncover new insights for the tasks.
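The abstract describes an iterative update loop with a bandit-inspired reward balancing exploitation and exploration. Below is a minimal sketch of one plausible reading of that loop, assuming a UCB-style reward (mean accuracy plus an exploration bonus); the function names, the pool-truncation heuristic, and the `generate_fn` callback are illustrative assumptions, not the paper's actual implementation.

```python
import math

def ucb_reward(correct, trials, total_trials, c=1.0):
    # Exploitation term (empirical accuracy) plus a UCB-style exploration
    # bonus that favors hypotheses tested on few examples so far.
    if trials == 0:
        return float("inf")
    return correct / trials + c * math.sqrt(math.log(total_trials) / trials)

def update_hypotheses(hypotheses, example, label, generate_fn, pool_size=5):
    """One update step: pick the highest-reward hypothesis, test it on a new
    labeled example, and on a miss add a freshly generated hypothesis,
    keeping only the top pool_size by reward. (Hypothetical sketch.)"""
    total = sum(h["trials"] for h in hypotheses) + 1
    best = max(hypotheses,
               key=lambda h: ucb_reward(h["correct"], h["trials"], total))
    best["trials"] += 1
    if best["predict"](example) == label:
        best["correct"] += 1
    else:
        # Mispredicted example: ask the LLM (here, generate_fn) for a new
        # hypothesis conditioned on the wrongly handled example.
        hypotheses.append(generate_fn(example, label))
        hypotheses.sort(
            key=lambda h: ucb_reward(h["correct"], h["trials"], total),
            reverse=True)
        del hypotheses[pool_size:]
    return hypotheses
```

Here a "hypothesis" is reduced to a predicate over examples; in the paper's setting it would be a natural-language rule that an LLM both generates and applies.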