Demonstration-Guided Q-Learning

Ikechukwu Uchendu, Ted Xiao, Yao Lu, Mengyuan Yan, Joséphine Simon, Matthew Bennice, Chuyuan Fu, Karol Hausman

Semantic Scholar (2021)

Abstract

In many challenging reinforcement learning (RL) settings, demonstrations are used to assist with exploration by allowing policies or value functions to directly learn from successful experience. In this work, we explore additional ways to utilize expert demonstrations to expedite training in value-based RL. In particular, we propose Demonstration-Guided Q-Learning (DGQL), an algorithm that noisily replays expert demonstrations to guide exploration and enable more efficient Q-value propagation in value-based RL methods. Contrary to common methods that utilize demonstrations in the context of value-based RL, we show that DGQL effectively leverages demonstrations to guide exploration via a replaying curriculum that relaxes common assumptions in simulated environments. In addition to analyzing the empirical benefits of more efficient value propagation, we show that DGQL is able to scale to difficult vision-based robotic manipulation tasks.
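The abstract does not give algorithmic details, so the following is only a toy sketch of the general idea it describes (replaying demonstration actions with noise, on a curriculum, while running Q-learning), not the authors' implementation. The chain environment, the linear curriculum in `replay_prefix_length`, the noise model in `noisy_demo_action`, and every hyperparameter are assumptions made purely for illustration.

```python
import random

# Toy setup (all hypothetical): a 6-state chain with a sparse goal reward.
N_STATES = 6                      # states 0..5; state 5 is the goal
LEFT, RIGHT = 0, 1
DEMO = [RIGHT] * (N_STATES - 1)   # stand-in "expert" action sequence


def step(state, action):
    """Deterministic chain dynamics; reward 1.0 only on reaching the goal."""
    nxt = min(state + 1, N_STATES - 1) if action == RIGHT else max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def replay_prefix_length(episode, decay_episodes, demo_len):
    """Assumed curriculum: replay the full demo first, then linearly less."""
    frac = max(0.0, 1.0 - episode / decay_episodes)
    return int(frac * demo_len)


def noisy_demo_action(demo_action, noise_prob, rng):
    """Return the demo action, or a uniformly random one with prob. noise_prob."""
    if rng.random() < noise_prob:
        return rng.choice([LEFT, RIGHT])
    return demo_action


def train(episodes=300, decay_episodes=150, noise_prob=0.1,
          epsilon=0.1, alpha=0.5, gamma=0.9, max_steps=50, seed=0):
    """Tabular Q-learning whose early steps are driven by noisy demo replay."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for ep in range(episodes):
        k = replay_prefix_length(ep, decay_episodes, len(DEMO))
        state = 0
        for t in range(max_steps):
            if t < k:
                # Guided phase: follow the demonstration, with noise.
                action = noisy_demo_action(DEMO[t], noise_prob, rng)
            elif rng.random() < epsilon:
                action = rng.choice([LEFT, RIGHT])
            else:
                action = max((LEFT, RIGHT), key=lambda a: q[state][a])
            nxt, reward, done = step(state, action)
            # Standard one-step Q-learning update on every transition.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


if __name__ == "__main__":
    for s, (q_left, q_right) in enumerate(train()):
        print(f"state {s}: Q(left)={q_left:.3f}  Q(right)={q_right:.3f}")
```

In this sketch the guided phase makes the sparse reward reachable from the very first episodes, so values can propagate backward along the chain before the agent has to explore on its own. How DGQL itself schedules replay and injects noise is not stated in the abstract.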
