How Reliable is Your Simulator? Analysis on the Limitations of Current LLM-based User Simulators for Conversational Recommendation
Companion Proceedings of the ACM on Web Conference 2024 (2024)
Abstract
A Conversational Recommender System (CRS) interacts with users through natural
language to understand their preferences and provide personalized
recommendations in real time. CRSs have demonstrated significant potential,
prompting researchers to focus on developing more realistic and reliable
user simulators. Recently, the capabilities of Large Language
Models (LLMs) have attracted a lot of attention in various fields.
Simultaneously, efforts are underway to construct user simulators based on
LLMs. While these works showcase innovation, they also come with certain
limitations that require attention. In this work, we aim to analyze the
limitations of using LLMs in constructing user simulators for CRS, to guide
future research. To achieve this goal, we conduct analytical validation on a
notable work, iEvaLM. Through multiple experiments on two widely-used datasets
in the field of conversational recommendation, we highlight several issues with
the current evaluation methods for user simulators based on LLMs: (1) Data
leakage, which occurs in conversational history and the user simulator's
replies, results in inflated evaluation results. (2) The success of CRS
recommendations depends more on the availability and quality of conversational
history than on the responses from user simulators. (3) Controlling the output
of the user simulator through a single prompt template proves challenging. To
overcome these limitations, we propose SimpleUserSim, which employs a
straightforward strategy to guide the topic toward the target items. Our study
validates the ability of CRS models to utilize the interaction information,
significantly improving the recommendation results.