Content Conditional Debiasing for Fair Text Embedding
CoRR(2024)
Abstract
Mitigating biases in machine learning models has gained increasing attention
in Natural Language Processing (NLP). Yet, only a few studies focus on fair
text embeddings, which are crucial yet challenging for real-world applications.
In this paper, we propose a novel method for learning fair text embeddings. We
achieve fairness while preserving utility by ensuring conditional
independence between sensitive attributes and text embeddings, conditioned on
the content. Specifically, we enforce that embeddings of texts with different
sensitive attributes but identical content maintain the same distance toward
the embedding of their corresponding neutral text. Furthermore, we address the
issue of lacking proper training data by using Large Language Models (LLMs) to
augment texts into different sensitive groups. Our extensive evaluations
demonstrate that our approach effectively improves fairness while preserving
the utility of embeddings, representing a pioneering effort in achieving
conditional independence for fair text embeddings.
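The distance-equalization idea described above can be sketched in a few lines. This is a minimal illustration under my own assumptions (Euclidean distance, toy 3-d vectors; the function name `fairness_loss` and the specific penalty are hypothetical), not the paper's actual objective:

```python
import numpy as np

def fairness_loss(emb_a: np.ndarray, emb_b: np.ndarray, emb_neutral: np.ndarray) -> float:
    """Penalize the gap between each sensitive-attribute variant's distance
    to the embedding of the corresponding neutral text."""
    d_a = np.linalg.norm(emb_a - emb_neutral)  # distance of variant A (e.g. male-worded text)
    d_b = np.linalg.norm(emb_b - emb_neutral)  # distance of variant B (e.g. female-worded text)
    return abs(d_a - d_b)  # zero when both variants are equidistant from the neutral text

# Toy example with hypothetical 3-d embeddings: both variants sit at unit
# distance from the neutral embedding, so the penalty is zero.
e_neutral = np.array([0.0, 0.0, 0.0])
e_a = np.array([1.0, 0.0, 0.0])
e_b = np.array([0.0, 1.0, 0.0])
print(fairness_loss(e_a, e_b, e_neutral))  # 0.0
```

In practice such a term would be added to a task loss during training, and the sensitive-attribute variants would come from the LLM-based augmentation the abstract mentions.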