Practice Makes Perfect: Planning to Learn Skill Parameter Policies
CoRR (2024)
Abstract
One promising approach towards effective robot decision making in complex,
long-horizon tasks is to sequence together parameterized skills. We consider a
setting where a robot is initially equipped with (1) a library of parameterized
skills, (2) an AI planner for sequencing together the skills given a goal, and
(3) a very general prior distribution for selecting skill parameters. Once
deployed, the robot should rapidly and autonomously learn to improve its
performance by specializing its skill parameter selection policy to the
particular objects, goals, and constraints in its environment. In this work, we
focus on the active learning problem of choosing which skills to practice to
maximize expected future task success. We propose that the robot should
estimate the competence of each skill, extrapolate the competence (asking: "how
much would the competence improve through practice?"), and situate the skill in
the task distribution through competence-aware planning. This approach is
implemented within a fully autonomous system where the robot repeatedly plans,
practices, and learns without any environment resets. Through experiments in
simulation, we find that our approach learns effective parameter policies more
sample-efficiently than several baselines. Experiments in the real world
demonstrate our approach's ability to handle noise from perception and control
and improve the robot's ability to solve two long-horizon mobile-manipulation
tasks after a few hours of autonomous practice.
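
The estimate, extrapolate, and situate steps described above can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how competence is estimated or extrapolated, so the Beta-posterior estimate, the fixed `gain` extrapolation rule, the product-of-competences plan model, and all function and variable names below are assumptions made purely for illustration.

```python
from math import prod


def estimate_competence(successes, failures):
    # Assumed estimator: posterior mean under a uniform Beta(1, 1) prior.
    return (successes + 1) / (successes + failures + 2)


def extrapolate_competence(competence, gain=0.5):
    # Assumed rule: practice closes a fixed fraction of the gap to 1.0.
    return competence + gain * (1.0 - competence)


def expected_task_success(plans, competence):
    # plans: list of (task probability, skills used by that task's plan).
    # Assumes a plan succeeds only if every skill in it succeeds independently.
    return sum(p * prod(competence[s] for s in skills) for p, skills in plans)


def choose_skill_to_practice(outcomes, plans):
    # Estimate current competence for every skill.
    current = {s: estimate_competence(*counts) for s, counts in outcomes.items()}

    def score(skill):
        # Situate: how much would task success rise if only this skill improved?
        improved = dict(current)
        improved[skill] = extrapolate_competence(current[skill])
        return expected_task_success(plans, improved)

    return max(current, key=score)


# Hypothetical data: (successes, failures) per skill, and a task distribution.
example_outcomes = {"pick": (8, 2), "place": (3, 7), "sweep": (1, 9)}
example_plans = [(0.9, ["pick", "place"]), (0.1, ["pick", "sweep"])]
best = choose_skill_to_practice(example_outcomes, example_plans)
```

In this hypothetical data, `sweep` has the lowest estimated competence, yet the sketch selects `place`, because `place` appears in the plan for the far more likely task. That is the intuition behind situating a skill in the task distribution rather than simply practicing the weakest skill.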