Robust Reinforcement Learning for Stochastic Linear Quadratic Control with Multiplicative Noise

IFAC-PapersOnLine (2021)

Abstract
This paper studies the robustness of reinforcement learning for discrete-time linear stochastic systems with multiplicative noise evolving in continuous state and action spaces. Policy iteration is one of the popular methods in reinforcement learning, yet its robustness for the stochastic linear quadratic regulator (LQR) problem with multiplicative noise has been a longstanding open issue. A solution in the spirit of small-disturbance input-to-state stability is given, guaranteeing that the iterates of the policy iteration algorithm remain bounded and enter a small neighborhood of the optimal solution whenever the error in each iteration is bounded and small. In addition, a novel off-policy multiple-trajectory optimistic least-squares policy iteration algorithm is proposed to learn a near-optimal solution of the stochastic LQR problem directly from online input/state data, without explicitly identifying the system matrices. The efficacy of the proposed algorithm is supported by rigorous convergence analysis and numerical results on a second-order example.
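The paper's algorithm is data-driven, learning from input/state trajectories without identifying the system matrices. For orientation only, the sketch below shows the model-based policy iteration that such a scheme approximates, for an illustrative stochastic LQR with state-multiplicative noise. The model x_{k+1} = (A + w_k C) x_k + B u_k, the noise variance sigma2, and all function and variable names are assumptions made for this example, not the paper's notation or method.

```python
import numpy as np

def policy_evaluation(A, B, C, Q, R, K, sigma2):
    """Cost matrix P of the policy u = -K x for the illustrative model
    x_{k+1} = (A + w_k C) x_k + B u_k, E[w_k] = 0, Var[w_k] = sigma2.
    P solves the generalized Lyapunov equation
        P = Q + K'RK + (A - BK)' P (A - BK) + sigma2 * C' P C,
    handled here by vectorization: vec(X' P X) = (X' kron X') vec(P)."""
    n = A.shape[0]
    F = A - B @ K
    M = np.kron(F.T, F.T) + sigma2 * np.kron(C.T, C.T)
    rhs = (Q + K.T @ R @ K).reshape(-1)
    P = np.linalg.solve(np.eye(n * n) - M, rhs).reshape(n, n)
    return (P + P.T) / 2  # symmetrize against round-off

def policy_improvement(A, B, P, R):
    """Greedy gain K = (R + B'PB)^{-1} B'PA; the noise in this toy model
    does not multiply the input, so the improvement step is standard."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def policy_iteration(A, B, C, Q, R, K0, sigma2, iters=30):
    """Alternate evaluation and improvement from a mean-square
    stabilizing initial gain K0."""
    K = K0
    for _ in range(iters):
        P = policy_evaluation(A, B, C, Q, R, K, sigma2)
        K = policy_improvement(A, B, P, R)
    return K, P

if __name__ == "__main__":
    # Second-order toy system; all numbers are made up for illustration.
    A = np.array([[0.9, 0.1],
                  [0.0, 0.8]])
    B = np.array([[0.0],
                  [0.1]])
    C = 0.1 * np.eye(2)      # state-multiplicative noise channel
    Q, R = np.eye(2), np.eye(1)
    K0 = np.zeros((1, 2))    # mean-square stabilizing here, since rho(A) < 1
    K, P = policy_iteration(A, B, C, Q, R, K0, sigma2=0.04)
    print("near-optimal gain K =\n", K)
```

Each policy evaluation above solves a generalized Lyapunov equation whose extra term sigma2 * C' P C accounts for the multiplicative noise; the paper's off-policy least-squares variant instead estimates the corresponding quantities from sampled input/state data over multiple trajectories.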
Keywords
Reinforcement learning control, Stochastic optimal control problems, Data-based control, Robustness analysis, Input-to-state stability