Multimodal Pretraining, Adaptation, and Generation for Recommendation: A Survey
CoRR (2024)
Abstract
Personalized recommendation serves as a ubiquitous channel for users to
discover information or items tailored to their interests. However, traditional
recommendation models primarily rely on unique IDs and categorical features for
user-item matching, potentially overlooking the nuanced essence of raw item
contents across multiple modalities such as text, image, audio, and video. This
underutilization of multimodal data poses a limitation to recommender systems,
especially in multimedia services like news, music, and short-video platforms.
The recent advancements in pretrained multimodal models offer new opportunities
and challenges in developing content-aware recommender systems. This survey
seeks to provide a comprehensive exploration of the latest advancements and
future trajectories in multimodal pretraining, adaptation, and generation
techniques, as well as their applications to recommender systems. Furthermore,
we discuss open challenges and opportunities for future research in this
domain. We hope that this survey, along with our tutorial materials, will
inspire further research efforts to advance this evolving landscape.