1. Finite-Time Analysis of Asynchronous Q-Learning Under Diminishing Step-Size From Control-Theoretic View
- Author
- Han-Dong Lim and Donghwan Lee
- Subjects
Reinforcement learning, Q-learning, convergence analysis, switching system, control theory, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Q-learning has long been one of the most popular reinforcement learning algorithms, and its theoretical analysis has been an active research topic for decades. Although research on the asymptotic convergence of Q-learning has a long tradition, non-asymptotic convergence has only recently come under active study. The main goal of this paper is to present a new finite-time analysis of asynchronous Q-learning under Markovian observation models from a control-system viewpoint. In particular, we introduce a discrete-time, time-varying switching system model of Q-learning with diminishing step-sizes, which significantly improves on the recent switching system analysis with constant step-sizes and leads to an $\mathcal{O}(\sqrt{\log k / k})$ convergence rate that is comparable to or better than most state-of-the-art results in the literature. Meanwhile, we use the continuous-time Lyapunov equation to avoid the analytical difficulty posed by diminishing step-sizes in discrete time. The proposed analysis brings additional insights, covers different scenarios, and provides new simplified templates for analysis that deepen our understanding of Q-learning via its unique connection to discrete-time switching systems.
- Published
- 2024