Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling
CoRR (2024)
Abstract
Motivated by applications in large-scale and multi-agent reinforcement
learning, we study the non-asymptotic performance of stochastic approximation
(SA) schemes with delayed updates under Markovian sampling. While the effect of
delays has been extensively studied for optimization, the manner in which they
interact with the underlying Markov process to shape the finite-time
performance of SA remains poorly understood. In this context, our first main
contribution is to show that under time-varying bounded delays, the delayed SA
update rule guarantees exponentially fast convergence of the last
iterate to a ball around the SA operator's fixed point. Notably, our bound is
tight in its dependence on both the maximum delay τ_max, and the
mixing time τ_mix. To achieve this tight bound, we develop a novel
inductive proof technique that, unlike various existing delayed-optimization
analyses, relies on establishing uniform boundedness of the iterates. As such,
our proof may be of independent interest. Next, to mitigate the impact of the
maximum delay on the convergence rate, we provide the first finite-time
analysis of a delay-adaptive SA scheme under Markovian sampling. In particular,
we show that the exponent of convergence of this scheme gets scaled down by
τ_avg, as opposed to τ_max for the vanilla delayed SA rule; here,
τ_avg denotes the average delay across all iterations. Moreover, the
adaptive scheme requires no prior knowledge of the delay sequence for step-size
tuning. Our theoretical findings shed light on the finite-time effects of
delays for a broad class of algorithms, including TD learning, Q-learning, and
stochastic gradient descent under Markovian sampling.
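The vanilla delayed SA rule discussed above can be illustrated with a toy sketch: estimating the stationary mean of a small Markov chain, where each update uses a stale iterate subject to a time-varying bounded delay. This is a hypothetical illustration under assumed parameters (the transition matrix, step size, and delay distribution are invented here), not the paper's exact algorithm or experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: estimate the stationary mean of a 2-state Markov chain via SA.
# The SA operator's fixed point is the stationary expectation of the
# observations; delays feed stale iterates into each update.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])        # transition matrix (assumed example)
vals = np.array([0.0, 1.0])       # observation emitted in each state
pi = np.array([2/3, 1/3])         # stationary distribution of P
target = pi @ vals                # fixed point = stationary mean = 1/3

T = 20000
tau_max = 5                       # maximum delay (bounded, time-varying)
alpha = 0.05                      # constant step size (assumed)
x_hist = [0.0]                    # keep iterate history so stale values exist
state = 0
for t in range(T):
    # Markovian sampling: advance the chain one step
    state = rng.choice(2, p=P[state])
    # draw a time-varying delay tau_t in [0, tau_max], capped by t
    tau_t = rng.integers(0, min(tau_max, t) + 1)
    x_stale = x_hist[-1 - tau_t]
    # delayed SA update: x_{t+1} = x_t + alpha * (observation - x_stale)
    x_hist.append(x_hist[-1] + alpha * (vals[state] - x_stale))

# The iterate settles into a ball around the fixed point, consistent
# with the exponential-convergence-to-a-ball result stated above.
err = abs(np.mean(x_hist[-2000:]) - target)
```

Averaging the tail of the trajectory makes the residual ball around the fixed point visible despite Markovian noise; the ball's size shrinks with the step size.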