Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
CoRR (2024)
Abstract
Much recent work seeks to evaluate values and opinions in large language
models (LLMs) using multiple-choice surveys and questionnaires. Most of this
work is motivated by concerns around real-world LLM applications. For example,
politically biased LLMs may subtly influence society when they are used by
millions of people. Such real-world concerns, however, stand in stark contrast
to the artificiality of current evaluations: real users do not typically ask
LLMs survey questions. Motivated by this discrepancy, we challenge the
prevailing constrained evaluation paradigm for values and opinions in LLMs and
explore more realistic unconstrained evaluations. As a case study, we focus on
the popular Political Compass Test (PCT). In a systematic review, we find that
most prior work using the PCT forces models to comply with the PCT's
multiple-choice format. We show that models give substantively different
answers when not forced; that answers change depending on how models are
forced; and that answers lack paraphrase robustness. Then, we demonstrate that
models give different answers yet again in a more realistic open-ended answer
setting. We distill these findings into recommendations and open challenges in
evaluating values and opinions in LLMs.