
Data-driven adaptive dynamic programming for partially observable nonzero-sum games via Q-learning method.

Authors :
Wang, Wei
Chen, Xin
Fu, Hao
Wu, Min
Source :
International Journal of Systems Science; May 2019, Vol. 50 Issue 7, p1338-1352, 15p
Publication Year :
2019

Abstract

This paper is concerned with a class of discrete-time linear nonzero-sum games with a partially observable system state. As is known, the optimal control policy for nonzero-sum games relies on full state measurement, which is hard to obtain in a partially observable environment. Moreover, achieving optimal control requires an accurate system model. To overcome these deficiencies, this paper develops a data-driven adaptive dynamic programming method via Q-learning that uses measurable input/output data without any system knowledge. First, a representation of the unmeasurable inner system state is built from historical input/output data. Then, based on this state representation, a Q-function-based policy iteration approach with convergence analysis is introduced to approximate the optimal control policy iteratively. A neural network (NN)-based actor-critic framework is applied to implement the developed data-driven approach. Finally, two simulation examples are provided to demonstrate the effectiveness of the developed approach. [ABSTRACT FROM AUTHOR]
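
The abstract's first step, representing the unmeasurable state through historical input/output data, can be illustrated with a small numerical check. This record does not give the paper's construction, so the sketch below is an assumption: it uses the standard observability argument for a linear system, with a hypothetical 2-state model (`A`, `B1`, `B2`, `C`), two players with scalar inputs, and one scalar output. The model is used here only to show that the current state is an exact linear function of the last two inputs and outputs; the data-driven method described in the abstract works without it.

```python
import random

# Hypothetical model, chosen only for illustration (not taken from the paper).
A = [[0.9, 0.2], [0.0, 0.7]]
B1 = [1.0, 0.5]   # input column for player 1
B2 = [0.0, 1.0]   # input column for player 2
C = [1.0, 0.0]    # only the first state component is measured

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def step(x, u1, u2):
    """One step of x_{k+1} = A x_k + B1 u1_k + B2 u2_k."""
    return [dot(A[0], x) + B1[0] * u1 + B2[0] * u2,
            dot(A[1], x) + B1[1] * u1 + B2[1] * u2]

def reconstruct(y_km2, y_km1, u_km2, u_km1):
    """Recover x_k from outputs y_{k-2}, y_{k-1} and input pairs at k-2, k-1."""
    # Row vector C A, the second row of the observability matrix [C; C A].
    CA = [C[0] * A[0][0] + C[1] * A[1][0], C[0] * A[0][1] + C[1] * A[1][1]]
    # y_{k-2} = C x_{k-2};  y_{k-1} = C A x_{k-2} + C B1 u1_{k-2} + C B2 u2_{k-2}
    rhs0 = y_km2
    rhs1 = y_km1 - dot(C, B1) * u_km2[0] - dot(C, B2) * u_km2[1]
    det = C[0] * CA[1] - C[1] * CA[0]   # nonzero because (A, C) is observable
    x_km2 = [(CA[1] * rhs0 - C[1] * rhs1) / det,
             (C[0] * rhs1 - CA[0] * rhs0) / det]
    # Roll the recovered past state forward using the recorded inputs.
    return step(step(x_km2, *u_km2), *u_km1)

# Simulate a trajectory driven by random inputs from both players.
random.seed(0)
x = [1.0, -2.0]
xs, ys, us = [], [], []
for _ in range(20):
    u = (random.uniform(-1, 1), random.uniform(-1, 1))
    xs.append(x)
    ys.append(dot(C, x))
    us.append(u)
    x = step(x, *u)
xs.append(x)

# Compare the reconstructed state with the true (normally hidden) state.
max_err = 0.0
for k in range(2, 21):
    x_hat = reconstruct(ys[k - 2], ys[k - 1], us[k - 2], us[k - 1])
    err = max(abs(x_hat[0] - xs[k][0]), abs(x_hat[1] - xs[k][1]))
    max_err = max(max_err, err)
print(max_err)
```

Because `reconstruct` is linear in the recorded inputs and outputs, any quadratic function of the state can be rewritten as a quadratic function of that data vector, which is one plausible reason a Q-function can be learned from input/output data alone.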

Details

Language :
English
ISSN :
0020-7721
Volume :
50
Issue :
7
Database :
Complementary Index
Journal :
International Journal of Systems Science
Publication Type :
Academic Journal
Accession number :
136782386
Full Text :
https://doi.org/10.1080/00207721.2019.1599463