Building Open-Ended Embodied Agent via Language-Policy Bidirectional Adaptation
CoRR (2023)
Abstract
Building open-ended learning agents poses challenges for both pre-trained
large language model (LLM) and reinforcement learning (RL) approaches. LLMs
struggle with context-specific real-time interactions, while RL methods explore
inefficiently. To this end, we propose OpenContra, a co-training framework in
which an LLM and goal-conditioned RL (GRL) cooperate to construct an open-ended
agent capable of comprehending arbitrary human instructions. The implementation
comprises two stages: (1) fine-tuning an LLM to translate human instructions
into structured goals, and curriculum training a goal-conditioned RL policy to
execute arbitrary goals; (2) collaborative training in which the LLM and the RL
policy learn to adapt to each other, achieving open-endedness over the
instruction space. We conduct experiments on Contra, a battle royale FPS game
with a complex and vast goal space. The results show that an agent trained with
OpenContra comprehends arbitrary human instructions and completes goals with a
high completion ratio, suggesting that OpenContra may be the first practical
solution for constructing open-ended embodied agents.
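
The data flow of stage (1) can be illustrated with a toy sketch: an instruction is translated into a structured goal, a goal-conditioned policy acts on it, and a completion ratio is measured. Everything here is an assumption made for illustration and not the paper's implementation: the keyword table stands in for the fine-tuned LLM, the rule-based policy stands in for the trained RL policy, the goal fields (`posture`, `heading`) are hypothetical, and the completion ratio is taken to be the fraction of goal fields satisfied. The co-training of stage (2) is not modeled.

```python
from typing import Dict, Optional, Tuple

Goal = Dict[str, str]
State = Dict[str, str]

# Hypothetical keyword table standing in for the fine-tuned LLM.
KEYWORD_TO_GOAL = {
    "crouch": ("posture", "crouch"),
    "stand": ("posture", "stand"),
    "north": ("heading", "north"),
    "south": ("heading", "south"),
}


def translate_instruction(instruction: str) -> Goal:
    """Map a free-text instruction to a structured goal (toy keyword matching)."""
    text = instruction.lower()
    goal: Goal = {}
    for keyword, (field, value) in KEYWORD_TO_GOAL.items():
        if keyword in text:
            goal[field] = value
    return goal


def goal_conditioned_policy(state: State, goal: Goal) -> Optional[Tuple[str, str]]:
    """Toy policy: pick one action that fixes the first unmet goal field."""
    for field, value in goal.items():
        if state.get(field) != value:
            return (field, value)
    return None  # every goal field is already satisfied


def rollout(state: State, goal: Goal, max_steps: int = 10) -> State:
    """Apply the policy for at most max_steps and return the final state."""
    state = dict(state)  # copy so the caller's state is not mutated
    for _ in range(max_steps):
        action = goal_conditioned_policy(state, goal)
        if action is None:
            break
        field, value = action
        state[field] = value
    return state


def completion_ratio(state: State, goal: Goal) -> float:
    """Fraction of goal fields that the state satisfies (assumed definition)."""
    if not goal:
        return 0.0
    met = sum(1 for field, value in goal.items() if state.get(field) == value)
    return met / len(goal)
```

For example, `translate_instruction("Crouch and head north")` yields a goal with two fields, and a rollout from a standing, south-facing state satisfies both of them within the default step budget.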