Reinforcement learning method for the multi-objective speed trajectory optimization of a freight train.

Authors :
Lin, Xuan
Liang, Zhicheng
Shen, Lijuan
Zhao, Fengyuan
Liu, Xinyu
Sun, Pengfei
Cao, Taiqiang
Source :
Control Engineering Practice. Sep2023, Vol. 138, pN.PAG-N.PAG. 1p.
Publication Year :
2023

Abstract

With the growing urgency of mitigating the greenhouse effect, reducing the energy consumption of freight trains has attracted much attention. Multiple constraints must be taken into account to solve the energy-efficient control problem, which can be reformulated as a multi-objective optimization problem. This paper proposes a Reinforcement Learning (RL) method for multi-objective speed trajectory optimization that targets energy efficiency, punctuality and accurate parking simultaneously. Because the solution space of the optimization problem is large and discrete, a Gated Recurrent Unit (GRU)-based network is proposed to approximate the optimal value function quickly, replacing the lookup Q-table. A new architecture that includes an embedding matrix is used to model the control sequence that generates the speed trajectory. In addition, a Deep Q-Network (DQN) framework is constructed to train the GRU network without relying on prior knowledge of the freight train model. Finally, the Intelligent Train Operation (ITO) algorithm is proposed and verified using data from the Beijing–Guangzhou Railway Line and the HXD1B electric locomotive. The case studies indicate that the reward function of the ITO algorithm converges rapidly and that energy consumption decreases monotonically with trip time, which satisfies the multiple optimization objectives. In terms of energy saving, the ITO algorithm performs better than Fuzzy Predictive Control (FPC), the Genetic Algorithm (GA) and the field test data. The computation time for different speed trajectories demonstrates that the ITO algorithm is applicable to generating the optimal speed trajectory off-line.

• A novel multi-objective optimization model based on reinforcement learning is proposed to search for the optimal control sequence. The embedding matrix is used to formulate the control sequence.
• An intelligent train operation algorithm based on the gated recurrent unit network is proposed to compute the optimal speed trajectory without precise train dynamics or prior knowledge from experienced drivers.
• The energy consumption is reduced by 3.17%, 0.71% and 6.77% compared with fuzzy predictive control, the genetic algorithm and the field test data, respectively.

[ABSTRACT FROM AUTHOR]
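The abstract names three objectives (energy efficiency, punctuality, accurate parking) and a DQN training framework, but this record does not give the reward function or the update rule. The following is only a minimal sketch under stated assumptions: `scalarized_reward`, its weights and its units are hypothetical illustrations of one common way to fold several objectives into a single reward, and `td_target` is the standard one-step DQN target, not a detail confirmed by the source.

```python
from typing import Sequence


def scalarized_reward(energy_kwh: float,
                      trip_time_s: float,
                      target_time_s: float,
                      stop_error_m: float,
                      w_energy: float = 1.0,
                      w_time: float = 1.0,
                      w_stop: float = 1.0) -> float:
    """Hypothetical weighted-sum reward (an assumption, not the paper's).

    Each objective is expressed as a cost, so the reward is the negative
    weighted sum: lower energy, smaller schedule deviation and smaller
    stopping error all increase the reward.
    """
    time_penalty = abs(trip_time_s - target_time_s)  # punctuality
    stop_penalty = abs(stop_error_m)                 # parking accuracy
    return -(w_energy * energy_kwh
             + w_time * time_penalty
             + w_stop * stop_penalty)


def td_target(reward: float,
              next_q_values: Sequence[float],
              gamma: float,
              done: bool) -> float:
    """Standard one-step DQN target: r + gamma * max_a' Q(s', a').

    `next_q_values` stands in for the output of a value network (a GRU-based
    one in the paper) evaluated on the next state; here it is a plain list.
    """
    if done:
        return reward  # no bootstrapping past the end of the trip
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    # A punctual, exactly stopped run is penalized only for its energy use.
    r = scalarized_reward(energy_kwh=100.0, trip_time_s=3600.0,
                          target_time_s=3600.0, stop_error_m=0.0)
    print(r)
    print(td_target(r, next_q_values=[-50.0, -20.0], gamma=0.9, done=False))
```

In such a sketch the weights set the trade-off between the objectives; how the authors actually balance them is described only in the full text.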

Details

Language :
English
ISSN :
09670661
Volume :
138
Database :
Academic Search Index
Journal :
Control Engineering Practice
Publication Type :
Academic Journal
Accession number :
166742060
Full Text :
https://doi.org/10.1016/j.conengprac.2023.105605