Language Models for Text Classification: Is In-Context Learning Enough?
CoRR (2024)
Abstract
Recent foundational language models have shown state-of-the-art performance
in many NLP tasks in zero- and few-shot settings. An advantage of these models
over more standard approaches based on fine-tuning is the ability to understand
instructions written in natural language (prompts), which helps them generalise
better to different tasks and domains without the need for specific training
data. This makes them suitable for addressing text classification problems for
domains with limited amounts of annotated instances. However, existing research
is limited in scale and lacks understanding of how text generation models
combined with prompting techniques compare to more established methods for text
classification such as fine-tuning masked language models. In this paper, we
address this research gap by performing a large-scale evaluation study for 16
text classification datasets covering binary, multiclass, and multilabel
problems. In particular, we compare zero- and few-shot approaches of large
language models to fine-tuning smaller language models. We also analyse the
results by prompt, classification type, domain, and number of labels. Overall,
the results show that fine-tuning smaller, more efficient language models can
still outperform few-shot approaches built on larger language models, which
leave room for improvement when it comes to text classification.
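
To make the comparison concrete, the sketch below contrasts the two paradigms the study evaluates: zero-shot prompting of a generative model versus fine-tuning a smaller masked language model. This is a minimal illustration using the Hugging Face transformers and datasets libraries, not the authors' code; the model names (gpt2, roberta-base), the imdb dataset, and the prompt wording are assumptions made for the example, not details from the paper.

```python
# Minimal sketch of the two paradigms compared in the paper (illustrative only).

from transformers import pipeline

# Paradigm 1: zero-shot prompting of a generative language model.
# The task is described in natural language; no labelled training data is used.
generator = pipeline("text-generation", model="gpt2")  # placeholder model
prompt = (
    "Classify the sentiment of the review as positive or negative.\n"
    "Review: The plot was dull and the pacing was worse.\n"
    "Sentiment:"
)
print(generator(prompt, max_new_tokens=3)[0]["generated_text"])

# Paradigm 2: fine-tuning a smaller masked language model on labelled data.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("imdb")  # example binary dataset (an assumption)
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoded = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True), batched=True
)
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2
)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out", num_train_epochs=1),
    train_dataset=encoded["train"],
    eval_dataset=encoded["test"],
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
)
trainer.train()
```

The zero-shot path needs no task-specific data at all, while the fine-tuning path requires a labelled training set; the paper's headline finding is that, given such data, the second path can still win.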