MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs
CoRR (2024)
Abstract
Generative large language models (LLMs) have been shown to exhibit harmful
biases and stereotypes. While safety fine-tuning typically takes place in
English, if at all, these models are being used by speakers of many different
languages. There is existing evidence that the performance of these models is
inconsistent across languages and that they discriminate based on demographic
factors of the user. Motivated by this, we investigate whether the social
stereotypes exhibited by LLMs differ as a function of the language used to
prompt them, while controlling for cultural differences and task accuracy. To
this end, we present MBBQ (Multilingual Bias Benchmark for Question-answering),
a carefully curated version of the English BBQ dataset extended to Dutch,
Spanish, and Turkish, which measures stereotypes commonly held across these
languages. We further complement MBBQ with a parallel control dataset to
measure task performance on the question-answering task independently of bias.
Our results based on several open-source and proprietary LLMs confirm that some
non-English languages suffer from bias more than English, even when controlling
for cultural shifts. Moreover, we observe significant cross-lingual differences
in bias behaviour for all except the most accurate models. With the release of
MBBQ, we hope to encourage further research on bias in multilingual settings.
The dataset and code are available at https://github.com/Veranep/MBBQ.