Large Language Models for Psycholinguistic Plausibility Pretesting
Conference of the European Chapter of the Association for Computational Linguistics (2024)
Abstract
In psycholinguistics, the creation of controlled materials is crucial to
ensure that research outcomes are solely attributed to the intended
manipulations and not influenced by extraneous factors. To achieve this,
psycholinguists typically pretest linguistic materials, where a common pretest
is to solicit plausibility judgements from human evaluators on specific
sentences. In this work, we investigate whether Language Models (LMs) can be
used to generate these plausibility judgements. We investigate a wide range of
LMs across multiple linguistic structures and evaluate whether their
plausibility judgements correlate with human judgements. We find that GPT-4
plausibility judgements highly correlate with human judgements across the
structures we examine, whereas other LMs correlate well with humans on commonly
used syntactic structures. We then test whether this correlation implies that
LMs can be used instead of humans for pretesting. We find that when
coarse-grained plausibility judgements are needed, this works well, but when
fine-grained judgements are necessary, even GPT-4 does not provide satisfactory
discriminative power.