Hierarchical Prototypes for Unsupervised Dynamics Generalization in Model-Based Reinforcement Learning

Jiaxian Guo, Mingming Gong, Yali Du, Zhen Wang, Dacheng Tao

ICLR 2023 (2023)

Abstract

By incorporating an environment-specific factor into dynamics prediction, model-based reinforcement learning (MBRL) can generalise to environments with diverse dynamics. In most real-world scenarios the environment-specific factor is not observable, so existing methods attempt to estimate it from historical transition segments. However, earlier work was unable to learn distinct clusters for the environment-specific factors estimated from different environments, resulting in poor performance. To address this issue, we introduce a set of environmental prototypes, each representing the environment-specific representation of one environment. Encouraging the learned environment-specific factors to resemble their assigned environmental prototypes more closely enhances the discrimination between factors estimated from distinct environments. To learn such prototypes, we first construct a prototype for each sampled trajectory and then hierarchically combine trajectory prototypes with similar semantics into one environmental prototype. Experiments demonstrate that the environment-specific factors estimated by our method have superior clustering performance and consistently improve MBRL's generalisation performance in six environments.
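The two-level idea in the abstract (one prototype per trajectory, then merging similar trajectory prototypes into environmental prototypes, then pulling factors toward their assigned prototype) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify how prototypes are built, merged, or used in the loss. The mean-based prototypes, the greedy cosine-similarity merging, the softmax loss with a temperature, and all function names below are assumptions made purely for illustration.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Scale vectors to unit length so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def trajectory_prototypes(factors, traj_ids):
    # ASSUMPTION: a trajectory prototype is the mean of that trajectory's factors.
    ids = np.unique(traj_ids)
    protos = np.stack([factors[traj_ids == i].mean(axis=0) for i in ids])
    return ids, l2_normalize(protos)

def merge_prototypes(protos, num_envs):
    # ASSUMPTION: greedy agglomerative merging of the two most similar groups
    # until num_envs groups remain; the paper's merging rule may differ.
    groups = [[i] for i in range(len(protos))]
    centers = [p for p in protos]
    while len(groups) > num_envs:
        c = np.stack(centers)
        sim = c @ c.T
        np.fill_diagonal(sim, -np.inf)  # ignore self-similarity
        a, b = np.unravel_index(np.argmax(sim), sim.shape)
        a, b = int(min(a, b)), int(max(a, b))
        groups[a] = groups[a] + groups.pop(b)
        centers.pop(b)
        centers[a] = l2_normalize(protos[groups[a]].mean(axis=0))
    assign = np.empty(len(protos), dtype=int)
    for g, members in enumerate(groups):
        assign[members] = g
    return np.stack(centers), assign

def prototype_loss(factors, env_protos, targets, temperature=0.1):
    # Cross-entropy over cosine similarities: low when each factor is closest
    # to its assigned environmental prototype. The temperature is hypothetical.
    logits = l2_normalize(factors) @ env_protos.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(factors)), targets].mean())

# Synthetic demo: 4 trajectories, the first two from one environment,
# the last two from another (labels are never shown to the method).
rng = np.random.default_rng(0)
env_centers = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
traj_ids = np.repeat(np.arange(4), 10)
factors = env_centers[traj_ids // 2] + 0.05 * rng.standard_normal((40, 3))

ids, traj_protos = trajectory_prototypes(factors, traj_ids)
env_protos, assign = merge_prototypes(traj_protos, num_envs=2)
targets = assign[np.searchsorted(ids, traj_ids)]
loss = prototype_loss(factors, env_protos, targets)
```

In a trainable setting the loss would be computed on the output of a context encoder and backpropagated through it; here the factors are fixed arrays, so the sketch only shows how assignments and the objective could be formed.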

Keywords

Unsupervised Dynamics Generalization, Model-Based Reinforcement Learning
