Learnable Tokenizer for LLM-based Generative Recommendation
arXiv (2024)
Abstract
Harnessing Large Language Models (LLMs) for generative recommendation has
garnered significant attention due to LLMs' powerful capacities such as rich
world knowledge and reasoning. However, a critical challenge lies in
transforming recommendation data into the language space of LLMs through
effective item tokenization. Existing approaches, such as ID identifiers,
textual identifiers, and codebook-based identifiers, exhibit limitations in
encoding semantic information, incorporating collaborative signals, or handling
code assignment bias. To address these shortcomings, we propose LETTER (a
LEarnable Tokenizer for generaTivE Recommendation), designed to meet the key
criteria of identifiers by integrating hierarchical semantics, collaborative
signals, and code assignment diversity. LETTER integrates Residual Quantized
VAE for semantic regularization, a contrastive alignment loss for collaborative
regularization, and a diversity loss to mitigate code assignment bias. We
instantiate LETTER within two generative recommender models and introduce a
ranking-guided generation loss to enhance their ranking ability. Extensive
experiments across three datasets demonstrate the superiority of LETTER in item
tokenization, thereby advancing the state-of-the-art in the field of generative
recommendation.
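The hierarchical identifiers mentioned above come from residual quantization: an item embedding is mapped to a tuple of codebook indices, one per level, each quantizing the residual left by the previous level. A minimal sketch of that assignment step, assuming plain list-of-lists codebooks and Euclidean distance (illustrative names, not the paper's exact RQ-VAE formulation):

```python
def residual_quantize(z, codebooks):
    """Assign a hierarchical code to an item embedding via residual quantization.

    Hedged sketch of the assignment step in an RQ-VAE-style tokenizer: at each
    level, pick the codebook vector nearest to the current residual, record its
    index, and subtract it before moving to the next level. The metric and data
    layout here are assumptions for illustration.
    """
    codes = []
    residual = list(z)
    for codebook in codebooks:  # one codebook per hierarchy level
        # nearest codeword under squared Euclidean distance
        idx = min(
            range(len(codebook)),
            key=lambda k: sum((r - c) ** 2
                              for r, c in zip(residual, codebook[k])),
        )
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes, residual
```

Coarse levels capture broad semantics and later levels refine them, which is what gives the resulting item tokens their hierarchical structure.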