NewsQs: Multi-Source Question Generation for the Inquiring Mind
CoRR (2024)
Abstract
We present NewsQs (news-cues), a dataset that provides question-answer pairs
for multiple news documents. To create NewsQs, we augment a traditional
multi-document summarization dataset with questions automatically generated by
a T5-Large model fine-tuned on FAQ-style news articles from the News On the Web
corpus. We show that fine-tuning a model with control codes produces questions
that human evaluators judge acceptable more often than those from the same
model without them. We filter our data with a QNLI model that correlates
highly with human annotations. We release our final dataset of
high-quality questions, answers, and document clusters as a resource for future
work in query-based multi-document summarization.
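
The two steps named in the abstract, conditioning generation on a control code and filtering question-answer pairs by a model score, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `<code>` prefix format, the names `QAPair`, `with_control_code`, and `filter_pairs`, the 0.5 threshold, and especially `toy_overlap_score`, a simple word-overlap stand-in for the QNLI model, which is not reproduced here.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QAPair:
    question: str
    answer: str


def with_control_code(code: str, document_text: str) -> str:
    """Prepend a control code to the generator input.

    The "<code> text" format is hypothetical; the source does not
    specify how control codes are encoded.
    """
    return f"<{code}> {document_text}"


def toy_overlap_score(question: str, answer: str) -> float:
    """Stand-in scorer (NOT a QNLI model): the fraction of distinct
    question words that also appear in the answer."""
    q_words = set(re.findall(r"\w+", question.lower()))
    a_words = set(re.findall(r"\w+", answer.lower()))
    if not q_words:
        return 0.0
    return len(q_words & a_words) / len(q_words)


def filter_pairs(
    pairs: List[QAPair],
    score: Callable[[str, str], float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the source
) -> List[QAPair]:
    """Keep only the pairs whose score reaches the threshold."""
    return [p for p in pairs if score(p.question, p.answer) >= threshold]


answer = "Heavy rain caused the flood in the valley."
pairs = [
    QAPair("What caused the flood?", answer),
    QAPair("Who won the election?", answer),
]
kept = filter_pairs(pairs, toy_overlap_score)
prompt = with_control_code("faq", "Some article text")
```

In a real pipeline, the `score` argument would wrap a trained QNLI classifier that returns the probability that the answer text addresses the question; the filtering logic itself would stay the same.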