Parameterized Batch Reinforcement Learning for Longitudinal Control of Autonomous Land Vehicles.

Authors :
Huang, Zhenhua
Xu, Xin
He, Haibo
Tan, Jun
Sun, Zhenping
Source :
IEEE Transactions on Systems, Man & Cybernetics. Systems. Apr 2019, Vol. 49 Issue 4, p730-741. 12p.
Publication Year :
2019

Abstract

This paper presents a parameterized batch reinforcement learning algorithm for near-optimal longitudinal control of autonomous land vehicles (ALVs). The proposed approach uses an actor-critic architecture, where parameterized feature vectors based on kernels are learned from collected samples for approximating the value functions and policies. One difference between the parameterized batch actor-critic (PBAC) algorithm and previous actor-critic learning approaches is that the critic and actor in PBAC share the same linear features, which has been theoretically proved to be a beneficial property for the convergence of actor-critic learning approaches. In order to obtain better learning efficiency, least-squares-based batch updating rules are designed for the critic and actor, respectively. Based on the PBAC learning algorithm, a data-driven longitudinal control method is presented for ALVs to obtain near-optimal control policies which adaptively tune the fuel/brake control signals to track different speeds. A multiobjective reward function is designed so that both tracking precision and driving smoothness are considered. Extensive experiments were conducted on a real ALV platform while driving on flat, slippery, sloping, and bumpy roads. The experimental results illustrate the superiority of the PBAC-based self-learning controller over conventional longitudinal control methods such as proportional-integral (PI) control and learning-based PI control. [ABSTRACT FROM AUTHOR]
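
The abstract names three ingredients: kernel-based features shared by critic and actor, least-squares batch updates, and a reward that weighs tracking precision against smoothness. The following is a minimal illustrative sketch of how such pieces could fit together. It is not the authors' PBAC algorithm. The Gaussian kernel, the LSTD-style critic solve, the regression-style actor fit, the reward weights, and all function and parameter names are assumptions made for illustration only.

```python
import numpy as np

def rbf_features(states, centers, width):
    """Gaussian kernel features (assumed kernel choice).

    states: (N, d) array, centers: (K, d) array -> (N, K) feature matrix.
    """
    diffs = states[:, None, :] - centers[None, :, :]
    return np.exp(-np.sum(diffs ** 2, axis=2) / (2.0 * width ** 2))

def reward(speed_error, control_change, w_track=1.0, w_smooth=0.1):
    """Hypothetical multiobjective reward: penalizes speed-tracking error
    and abrupt changes in the fuel/brake command. Weights are made up."""
    return -(w_track * speed_error ** 2 + w_smooth * control_change ** 2)

def lstd_critic(phi, phi_next, rewards, gamma=0.95, ridge=1e-6):
    """Batch least-squares critic (LSTD-style stand-in).

    Solves A w = b with A = Phi^T (Phi - gamma * Phi_next), b = Phi^T r,
    so that V(s) is approximated by phi(s) @ w.
    """
    A = phi.T @ (phi - gamma * phi_next) + ridge * np.eye(phi.shape[1])
    b = phi.T @ rewards
    return np.linalg.solve(A, b)

def ls_actor(phi, target_actions, ridge=1e-6):
    """Least-squares fit of a linear policy u = phi(s) @ theta.

    Uses the same feature matrix as the critic, mirroring the shared-feature
    idea; how target actions are produced is left out (assumption).
    """
    A = phi.T @ phi + ridge * np.eye(phi.shape[1])
    return np.linalg.solve(A, phi.T @ target_actions)
```

Both solves reuse one feature matrix, which is the only sense in which this sketch reflects the shared-feature property described in the abstract; the actual update rules are given in the paper.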

Details

Language :
English
ISSN :
2168-2216
Volume :
49
Issue :
4
Database :
Academic Search Index
Journal :
IEEE Transactions on Systems, Man & Cybernetics. Systems
Publication Type :
Academic Journal
Accession number :
135443174
Full Text :
https://doi.org/10.1109/TSMC.2017.2712561