
Is Learning World Model Always Beneficial For Reinforcement Learning?

user-60f947d94c775efc5de23468 (2021)

Abstract
We propose a hypothesis in model-based reinforcement learning (MBRL): an RL agent can learn to solve tasks faster by learning to interact with a learned world model and exploiting its imperfect information about the environment. We develop two different architectures to evaluate this hypothesis. We show that a policy with access to such information outperforms the standalone policy on toy benchmarks. The results suggest that this is a promising avenue of research toward efficient MBRL algorithms that do not rely on rollouts.
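The abstract's core idea, a policy that consumes a learned world model's imperfect predictions alongside the raw observation, can be sketched minimally. Everything below (the 1-D chain environment, the model's systematic bias, the feature layout) is an illustrative assumption for exposition, not the paper's actual architecture.

```python
import numpy as np

# Illustrative toy setup (assumed, not from the paper): a bounded 1-D chain
# with actions "move left" and "move right", plus a deliberately imperfect
# learned dynamics model whose right moves overshoot.

N_STATES = 10
ACTIONS = (-1, +1)  # left, right

def true_step(s, a):
    """Ground-truth dynamics: bounded chain walk."""
    return int(np.clip(s + a, 0, N_STATES - 1))

def learned_model(s, a):
    """Imperfect learned model: systematically overshoots right moves."""
    bias = 0.5 if a > 0 else 0.0
    return float(np.clip(s + a + bias, 0, N_STATES - 1))

def model_augmented_obs(s):
    """Policy input: the observation concatenated with the model's imagined
    next state for each action -- the 'imperfect information about the
    environment' the policy gets access to."""
    imagined = [learned_model(s, a) for a in ACTIONS]
    return np.array([float(s), *imagined])

obs = model_augmented_obs(4)
# obs holds [4.0, 3.0, 5.5]: current state, then imagined s' for a=-1, a=+1.
# Note the mismatch with the true dynamics, where true_step(4, +1) == 5:
# the policy sees the model's error rather than a perfect rollout.
```

A standalone policy would receive only `[s]`; the hypothesis is that the richer, if biased, input above lets the agent learn faster without full model rollouts.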