Off-policy inverse Q-learning for discrete-time antagonistic unknown systems.

Authors :
Lian, Bosen
Xue, Wenqian
Xie, Yijing
Lewis, Frank L.
Davoudi, Ali
Source :
Automatica. Sep2023, Vol. 155, pN.PAG-N.PAG. 1p.
Publication Year :
2023

Abstract

This paper proposes a data-driven model-free inverse reinforcement learning (RL) algorithm to reconstruct the unknown cost function of the demonstrated discrete-time (DT) dynamical systems with antagonistic disturbances. We propose an inverse RL policy iteration scheme that uses system dynamics and the input policies, for deriving our main result of a data-driven off-policy inverse Q-learning algorithm using only demonstrated trajectories of the antagonistic system without knowing system dynamics and the control policy gain. This data-driven algorithm consists of Q-function evaluation, state-penalty weight improvement, and action policies update. We guarantee unbiased estimates in the data-driven algorithm when exploration noises exist for the persistence of the excitation. An example verifies the proposed algorithm. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
00051098
Volume :
155
Database :
Academic Search Index
Journal :
Automatica
Publication Type :
Academic Journal
Accession number :
166740697
Full Text :
https://doi.org/10.1016/j.automatica.2023.111171