Exploration in relational domains for model-based reinforcement learning

Journal of Machine Learning Research (2012)

Cited by 74 | Views 37
Abstract
A fundamental problem in reinforcement learning is balancing exploration and exploitation. We address this problem in the context of model-based reinforcement learning in large stochastic relational domains by developing relational extensions of the concepts of the E3 and R-MAX algorithms. Efficient exploration in exponentially large state spaces needs to exploit the generalization of the learned model: what in a propositional setting would be considered a novel situation and worth exploration may in the relational setting be a well-known context in which exploitation is promising. To address this, we introduce relational count functions, which generalize the classical notion of state and action visitation counts. We provide guarantees on the exploration efficiency of our framework using count functions, assuming access to a relational KWIK learner and a near-optimal planner. We propose a concrete exploration algorithm which integrates a practically efficient probabilistic rule learner and a relational planner (for which, however, no guarantees exist) and employs the contexts of learned relational rules as features to model the novelty of states and actions. Our results in noisy 3D simulated robot manipulation problems and in domains of the international planning competition demonstrate that our approach is more effective than existing propositional and factored exploration techniques.
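To make the abstract's central device concrete, the sketch below illustrates a relational count function that measures novelty at the level of abstract rule contexts rather than grounded states, combined with an R-MAX-style "known" threshold. The variabilization scheme, the threshold value, and all class and function names here are illustrative assumptions, not the paper's actual rule learner or planner.

from collections import defaultdict

class RelationalCountFunction:
    """Counts experiences by abstract rule context instead of grounded state (illustrative sketch)."""

    def __init__(self, known_threshold=5):
        self.known_threshold = known_threshold    # R-MAX-style "known" cutoff (assumed value)
        self.context_counts = defaultdict(int)    # visits per abstract context

    def context_of(self, state, action):
        # Variabilize: action arguments become ?x0, ?x1, ...; every other
        # constant becomes an anonymous variable, so experiences with
        # different objects but the same relational structure share a count.
        var = {obj: "?x%d" % i for i, obj in enumerate(action[1:])}
        def abstract(fact):
            return (fact[0],) + tuple(var.get(a, "?_") for a in fact[1:])
        relevant = frozenset(abstract(f) for f in state
                             if set(f[1:]) & set(action[1:]))
        return (action[0], relevant)

    def observe(self, state, action):
        self.context_counts[self.context_of(state, action)] += 1

    def count(self, state, action):
        return self.context_counts[self.context_of(state, action)]

    def is_known(self, state, action):
        # Exploit well-visited contexts; treat rarely seen contexts as novel
        # and worth exploring, in the spirit of E3 / R-MAX.
        return self.count(state, action) >= self.known_threshold


if __name__ == "__main__":
    cf = RelationalCountFunction(known_threshold=2)
    # Relational facts as tuples: (predicate, arg1, arg2, ...)
    s1 = {("on", "a", "table"), ("clear", "a"), ("clear", "b")}
    s2 = {("on", "c", "table"), ("clear", "c"), ("clear", "d")}
    cf.observe(s1, ("pick", "a"))
    cf.observe(s2, ("pick", "c"))
    # Both experiences abstract to the same context, so "pick a" in s1 now
    # counts as known even though the grounded states differ.
    print(cf.count(s1, ("pick", "a")), cf.is_known(s1, ("pick", "a"))))

Note that a propositional count function would treat s1 and s2 as unrelated and mark both as novel; the relational abstraction is what allows exploitation to kick in earlier, which is the point the abstract makes.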
Keywords
relational rule, relational extension, exploration efficiency, concrete exploration algorithm, relational KWIK learner, large stochastic relational domain, relational planner, efficient exploration, relational setting, relational count function, model-based reinforcement learning, robotics, reinforcement learning, statistical relational learning, exploration