The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise
CoRR (2024)
Abstract
Stochastic approximation is a class of algorithms that update a vector
iteratively, incrementally, and stochastically, including, e.g., stochastic
gradient descent and temporal difference learning. One fundamental challenge in
analyzing a stochastic approximation algorithm is to establish its stability,
i.e., to show that the stochastic vector iterates are bounded almost surely. In
this paper, we extend the celebrated Borkar-Meyn theorem for stability from the
martingale difference noise setting to the Markovian noise setting, which
greatly improves its applicability in reinforcement learning, especially in
those off-policy reinforcement learning algorithms with linear function
approximation and eligibility traces. Central to our analysis is the
diminishing asymptotic rate of change of a few functions, which is implied by
both a form of strong law of large numbers and a commonly used V4 Lyapunov
drift condition and trivially holds if the Markov chain is finite and
irreducible.
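
As a rough illustration of a stochastic approximation iterate driven by Markovian noise, the sketch below runs tabular TD(0) on a two-state irreducible Markov chain, the finite irreducible case that the abstract singles out as the easy one. It is not the paper's construction: the transition matrix `P`, rewards `r`, discount `gamma`, and step-size schedule are hypothetical values chosen for illustration, and the example is on-policy and tabular rather than off-policy with linear function approximation and eligibility traces. The function also tracks the largest iterate magnitude seen, so boundedness can be checked empirically on this toy problem.

```python
import random


def td0_markov_chain(num_steps=100_000, gamma=0.9, seed=0):
    """Tabular TD(0) on a 2-state irreducible Markov chain (illustrative only).

    Returns the final value estimates and the largest |v[i]| observed.
    """
    rng = random.Random(seed)
    # Hypothetical transition probabilities and rewards (assumptions).
    P = [[0.7, 0.3],
         [0.4, 0.6]]
    r = [1.0, 0.0]

    v = [0.0, 0.0]   # the iterate being updated
    s = 0            # current state of the Markov chain (the noise process)
    max_norm = 0.0

    for n in range(1, num_steps + 1):
        # Sample the next state from the chain: the noise is Markovian,
        # not i.i.d. and not a martingale difference sequence.
        s_next = 0 if rng.random() < P[s][0] else 1

        # Diminishing step size: the sum diverges, the sum of squares converges.
        alpha = 1.0 / (n ** 0.6)

        # Incremental, stochastic update of one coordinate of the iterate.
        td_error = r[s] + gamma * v[s_next] - v[s]
        v[s] += alpha * td_error

        max_norm = max(max_norm, abs(v[0]), abs(v[1]))
        s = s_next

    return v, max_norm


if __name__ == "__main__":
    values, largest = td0_markov_chain()
    print("value estimates:", values)
    print("largest iterate magnitude:", largest)
```

For these assumed values, the exact solution of v = r + gamma P v is v = (0.46/0.073, 0.36/0.073), so the estimates can be compared against it. Because every step size is at most 1 and the rewards lie in [0, 1], each update is a convex combination that keeps the iterates inside [-10, 10]; this elementary bound is specific to the toy setup and is unrelated to the stability argument developed in the paper.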