The benefits, risks and bounds of personalizing the alignment of large language models to individuals

Hannah Rose Kirk,Bertie Vidgen,Paul Röttger,Scott A. Hale

Nature Machine Intelligence（2024）

引用 0|浏览19

暂无评分

摘要

Large language models (LLMs) undergo ‘alignment’ so that they better reflect human values or preferences, and are safer or more useful. However, alignment is intrinsically difficult because the hundreds of millions of people who now interact with LLMs have different preferences for language and conversational norms, operate under disparate value systems and hold diverse political beliefs. Typically, few developers or researchers dictate alignment norms, risking the exclusion or under-representation of various groups. Personalization is a new frontier in LLM development, whereby models are tailored to individuals. In principle, this could minimize cultural hegemony, enhance usefulness and broaden access. However, unbounded personalization poses risks such as large-scale profiling, privacy infringement, bias reinforcement and exploitation of the vulnerable. Defining the bounds of responsible and socially acceptable personalization is a non-trivial task beset with normative challenges. This article explores ‘personalized alignment’, whereby LLMs adapt to user-specific data, and highlights recent shifts in the LLM ecosystem towards a greater degree of personalization. Our main contribution explores the potential impact of personalized LLMs via a taxonomy of risks and benefits for individuals and society at large. We lastly discuss a key open question: what are appropriate bounds of personalization and who decides? Answering this normative question enables users to benefit from personalized alignment while safeguarding against harmful impacts for individuals and society. Tailoring the alignment of large language models (LLMs) to individuals is a new frontier in generative AI, but unbounded personalization can bring potential harm, such as large-scale profiling, privacy infringement and bias reinforcement. Kirk et al. develop a taxonomy for risks and benefits of personalized LLMs and discuss the need for normative decisions on what are acceptable bounds of personalization.

查看译文

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要