Dialect prejudice predicts AI decisions about people's character, employability, and criminality
arxiv(2024)
摘要
Hundreds of millions of people now interact with language models, with uses
ranging from serving as a writing aid to informing hiring decisions. Yet these
language models are known to perpetuate systematic racial prejudices, making
their judgments biased in problematic ways about groups like African Americans.
While prior research has focused on overt racism in language models, social
scientists have argued that racism with a more subtle character has developed
over time. It is unknown whether this covert racism manifests in language
models. Here, we demonstrate that language models embody covert racism in the
form of dialect prejudice: we extend research showing that Americans hold
raciolinguistic stereotypes about speakers of African American English and find
that language models have the same prejudice, exhibiting covert stereotypes
that are more negative than any human stereotypes about African Americans ever
experimentally recorded, although closest to the ones from before the civil
rights movement. By contrast, the language models' overt stereotypes about
African Americans are much more positive. We demonstrate that dialect prejudice
has the potential for harmful consequences by asking language models to make
hypothetical decisions about people, based only on how they speak. Language
models are more likely to suggest that speakers of African American English be
assigned less prestigious jobs, be convicted of crimes, and be sentenced to
death. Finally, we show that existing methods for alleviating racial bias in
language models such as human feedback training do not mitigate the dialect
prejudice, but can exacerbate the discrepancy between covert and overt
stereotypes, by teaching language models to superficially conceal the racism
that they maintain on a deeper level. Our findings have far-reaching
implications for the fair and safe employment of language technology.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要