Traffic navigation via reinforcement learning with episodic-guided prioritized experience replay.
- Source :
- Engineering Applications of Artificial Intelligence, Nov 2024, Part A, Vol. 137.
- Publication Year :
- 2024
Abstract
- Deep Reinforcement Learning (DRL) models play a fundamental role in autonomous driving applications; however, they typically suffer from sample inefficiency because they require many interactions with the environment to learn effective policies, which makes training time-consuming. Prioritized Experience Replay (PER) addresses this shortcoming by prioritizing samples with high Temporal-Difference (TD) error for learning. In this context, this study contributes a sample-efficient DRL algorithm called Episodic-Guided Prioritized Experience Replay (EPER). The core innovation of EPER is an episodic memory dedicated to storing successful training episodes, from which expected returns for each state–action pair are extracted. These returns, combined with TD error-based prioritization, form a novel objective function for deep Q-network training. To prevent excessive determinism, EPER introduces exploration into the learning process by adding a regularization term to the objective function that encourages visiting state-space regions with diverse Q-values. The proposed EPER algorithm is suitable for training a DRL agent on episodic tasks and can be integrated into off-policy DRL models. EPER is applied to traffic navigation in highway driving, merging, roundabout, and intersection scenarios to showcase its engineering application. The results show that, compared with PER and an additional state-of-the-art training technique, EPER both speeds up agent training and learns a better policy, yielding lower collision rates in the constructed navigation scenarios.
• Proposed the Episodic-Guided Prioritized Experience Replay algorithm.
• Integrated episodic information into deep Q-network training.
• Regularized the DQN loss function for enhanced performance.
• Improved management of the exploration–exploitation trade-off.
• Enhanced vehicle autonomy with Episodic-Guided Prioritized Experience Replay.
[ABSTRACT FROM AUTHOR]
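The abstract describes EPER's mechanism only at a high level. As a rough illustration, the Python sketch below shows one plausible reading: a replay buffer that stores returns from successful episodes and blends them with TD-error magnitudes when computing sampling priorities, plus a loss with a Q-value-diversity regularizer. The class name EpisodicGuidedReplay, the coarse state key, the beta blend, and the standard-deviation regularizer are illustrative assumptions, not the paper's actual formulation.

import numpy as np

class EpisodicGuidedReplay:
    """Hypothetical EPER-style buffer: PER priorities guided by episodic returns."""

    def __init__(self, capacity, alpha=0.6, beta=0.5, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha            # priority exponent, as in standard PER
        self.beta = beta              # weight of the episodic bonus (assumption)
        self.eps = eps                # keeps every priority strictly positive
        self.buffer = []
        self.priorities = []
        self.episodic_returns = {}    # (state_key, action) -> best return seen

    def _key(self, state, action):
        # Coarse discretization so similar states share a memory slot (assumption).
        return (tuple(np.round(state, 2)), action)

    def store_episode(self, transitions, returns):
        """Store a successful episode together with its per-step returns."""
        for (s, a, r, s2, done), g in zip(transitions, returns):
            key = self._key(s, a)
            self.episodic_returns[key] = max(self.episodic_returns.get(key, -np.inf), g)
            if len(self.buffer) >= self.capacity:
                self.buffer.pop(0)
                self.priorities.pop(0)
            self.buffer.append((s, a, r, s2, done))
            # New samples get the current maximum priority, as in standard PER.
            self.priorities.append(max(self.priorities, default=1.0))

    def update_priority(self, idx, td_error):
        s, a = self.buffer[idx][0], self.buffer[idx][1]
        bonus = self.episodic_returns.get(self._key(s, a), 0.0)
        # Blend |TD error| with episodic-return guidance; the exact form is an assumption.
        self.priorities[idx] = (abs(td_error) + self.beta * max(bonus, 0.0) + self.eps) ** self.alpha

    def sample(self, batch_size, rng=None):
        rng = rng or np.random.default_rng()
        p = np.asarray(self.priorities)
        p = p / p.sum()
        idx = rng.choice(len(self.buffer), size=batch_size, p=p)
        return idx, [self.buffer[i] for i in idx]

def regularized_loss(td_errors, q_values, lam=0.01):
    # Squared TD loss minus a (hypothetical) diversity term that rewards spread
    # in predicted Q-values, mirroring the abstract's exploration regularizer.
    return np.mean(np.square(td_errors)) - lam * np.std(q_values)

In this reading, transitions that both surprise the network (high TD error) and lie on trajectories known to end well (high stored return) are replayed most often, while the regularizer keeps the agent from collapsing onto near-identical Q-values too early.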
Details
- Language :
- English
- ISSN :
- 0952-1976
- Volume :
- 137
- Database :
- Academic Search Index
- Journal :
- Engineering Applications of Artificial Intelligence
- Publication Type :
- Academic Journal
- Accession number :
- 179632152
- Full Text :
- https://doi.org/10.1016/j.engappai.2024.109147