Policy-Iteration-Based Learning for Nonlinear Player Game Systems With Constrained Inputs.

Authors :: Mu, Chaoxu
Wang, Ke
Sun, Changyin
Source :: IEEE Transactions on Systems, Man & Cybernetics. Systems. Oct2021, Vol. 51 Issue 10, p6488-6502. 15p.
Publication Year :: 2021
Abstract :: This article investigates the optimal control problem for nonlinear nonzero-sum differential games under control constraints when no initial admissible policies are available. An adaptive learning algorithm based on policy iteration is developed to approximate the Nash equilibrium using real-time data. A two-player continuous-time system is used to present the approximation mechanism, which is implemented as a critic–actor architecture for every player. The constraint is incorporated into the optimization by introducing a nonquadratic value function, and the associated constrained Hamilton–Jacobi equation is derived. A critic neural network (NN) and an actor NN are used to learn the value function and the optimal control policy, respectively, under novel weight tuning laws. To maintain stability during the learning phase, two stable operators are designed for the two actors. The proposed algorithm is proved to converge as a Newton’s iteration, and the stability of the closed-loop system is ensured by Lyapunov analysis. Finally, two simulation examples with different constraint scenarios demonstrate the effectiveness of the proposed learning scheme. [ABSTRACT FROM AUTHOR]
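
Note on the nonquadratic value function: the abstract does not give its form, so the following is only a minimal sketch of one choice that is common in the constrained-input control literature, and the article's own formulation may differ. It assumes a scalar input bounded by |u| < lam, a penalty W(u) = 2*r*∫_0^u lam*atanh(v/lam) dv, and a Hamiltonian that is affine in u. The names `nonquadratic_penalty`, `constrained_policy`, `lam`, `r`, and `grad_v_dot_g` are hypothetical and used for illustration only.

```python
import math


def nonquadratic_penalty(u, lam, r=1.0):
    """Closed form of W(u) = 2*r * integral_0^u lam*atanh(v/lam) dv (scalar u).

    Assumed penalty, not taken from the article. Defined only for |u| < lam.
    """
    if abs(u) >= lam:
        raise ValueError("u must satisfy |u| < lam")
    s = u / lam
    # integral of atanh(v/lam) dv = v*atanh(v/lam) + (lam/2)*ln(1 - (v/lam)^2)
    return 2.0 * lam * r * (u * math.atanh(s) + 0.5 * lam * math.log(1.0 - s * s))


def constrained_policy(grad_v_dot_g, lam, r=1.0):
    """Stationary point of W(u) + grad_v_dot_g * u with respect to u.

    grad_v_dot_g stands for g(x)^T * dV/dx evaluated at the current state
    (a hypothetical scalar here). The tanh keeps the result within [-lam, lam].
    """
    return -lam * math.tanh(grad_v_dot_g / (2.0 * lam * r))


if __name__ == "__main__":
    lam = 1.0
    u_star = constrained_policy(0.8, lam)
    print("bounded control:", u_star)
    print("penalty at that control:", nonquadratic_penalty(u_star, lam))
```

Under these assumptions, each player in a two-player game would apply the same formula with its own bound and its own value gradient. For small inputs the penalty behaves like the usual quadratic cost r*u^2, and it grows steeply as |u| approaches the bound.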

Subjects :: *HAMILTON-Jacobi equations
*DIFFERENTIAL games
*NASH equilibrium
*ALGORITHMS
*MACHINE learning
