NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data
CoRR (2024)
Abstract
To address the global issue of hateful content proliferating in online
platforms, hate speech detection (HSD) models are typically developed on
datasets collected in the United States, thereby failing to generalize to
English dialects from the Majority World. Furthermore, HSD models are often
evaluated on curated samples, raising concerns about overestimating model
performance in real-world settings. In this work, we introduce NaijaHate, the
first dataset annotated for HSD which contains a representative sample of
Nigerian tweets. We demonstrate that HSD evaluated on biased datasets
traditionally used in the literature largely overestimates real-world
performance on representative data. We also propose NaijaXLM-T, a pretrained
model tailored to the Nigerian Twitter context, and establish the key role
played by domain-adaptive pretraining and finetuning in maximizing HSD
performance. Finally, we show that in this context, a human-in-the-loop
approach to content moderation where humans review 1% [...]
flagged as hateful would make it possible to moderate 60% [...]. Taken
together, these results pave the way towards robust HSD systems and a better
protection of social media users from hateful content in low-resource settings.