1. Navigating the non-compliance effects on system optimal route guidance using reinforcement learning.
- Author
- Yun, Hyunsoo; Kim, Eui-jin; Ham, Seung Woo; and Kim, Dong-Kyu
- Subjects
- *REINFORCEMENT learning, *TRAVEL time (Traffic engineering), *ASSIGNMENT problems (Programming), *NONCOMPLIANCE, *TRANSPORTATION management
- Abstract
• Scenario where vehicles adhere to or deviate from the system optimal route guidance.
• Reinforcement learning approach to the dynamic system optimal assignment problem.
• Multi-agent reinforcement learning to assess the impact of non-compliance.
• Exploration of the balance between system efficiency and individual user preferences.

We consider a scenario where the transportation management center (TMC) guides future autonomous vehicles (AVs) toward optimal routes, aiming to bring the network in line with the system optimal (SO) principle. However, achieving this requires a joint decision-making process, while users may be non-compliant with the TMC's route guidance for personal gain. This paper models a future transportation network using microscopic simulation to introduce a novel concept of mixed equilibrium. In this framework, AVs follow the TMC's SO route guidance, while users can dynamically choose either to comply or to manually override this autonomy based on their own judgment. We initially model a fully compliant scenario, where a centralized Q-network, analogous to a TMC, is trained using reinforcement learning (RL) to minimize total system travel time (TSTT), providing optimal routes to users. Subsequently, we extend the problem setting to a multi-agent reinforcement learning (MARL) scenario, where users can comply with or deviate from the TMC's guidance based on their own decision-making. Through neural fictitious self-play (NFSP), we employ a modulating hyperparameter to investigate the impact of varying degrees of non-compliance on the overall system. Results indicate that our RL approach holds significant potential for addressing the dynamic system optimal assignment problem. Remarkably, the TMC's route guidance retains the essence of SO while integrating some level of non-compliance. However, we also demonstrate that dominant user-centric decision-making may lead to system inefficiencies while creating disparities among users. Our framework serves as an innovative tool in an AV-dominant future, offering a realistic perspective on network performance that aids in formulating effective traffic management strategies. [ABSTRACT FROM AUTHOR]
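
The tension the abstract describes between SO guidance and non-compliant users can be illustrated with a small static example. The sketch below is not the paper's method (which uses microscopic simulation, a Q-network, and NFSP); it assumes a hypothetical two-route network with unit demand, where route A takes time equal to its share of the flow and route B always takes time 1. The function names and the `compliance` parameter are illustrative assumptions: a fraction `compliance` of travelers follows the guidance, and the rest selfishly pick route A, which is never slower than route B for them.

```python
def total_travel_time(flow_a: float) -> float:
    """TSTT for unit demand on the toy network.

    Route A: per-traveler time equals flow_a (congestible).
    Route B: per-traveler time is fixed at 1.0.
    """
    return flow_a * flow_a + (1.0 - flow_a) * 1.0


def guided_flow_on_a(compliance: float) -> float:
    """Flow on route A when only a fraction of travelers follows guidance.

    Non-compliant travelers (1 - compliance) all choose route A.
    The guidance then steers compliant travelers so that the total flow
    on A is as close as possible to the system optimal value of 0.5,
    which minimizes x*x + (1 - x) over x in [0, 1].
    """
    selfish_on_a = 1.0 - compliance
    so_flow_a = 0.5
    return max(so_flow_a, selfish_on_a)


def tstt_under_compliance(compliance: float) -> float:
    """TSTT of the toy network for a given compliance fraction in [0, 1]."""
    return total_travel_time(guided_flow_on_a(compliance))


if __name__ == "__main__":
    for c in (0.0, 0.25, 0.5, 1.0):
        print(f"compliance={c:.2f}  TSTT={tstt_under_compliance(c):.4f}")
```

In this toy setting TSTT stays at the SO value as long as at least half of the travelers comply, and rises toward the user-equilibrium value as compliance drops further. That mirrors, in a much simpler form, the abstract's observation that guidance can absorb some non-compliance before efficiency degrades.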
- Published
- 2024