
Deep Reinforcement Learning for Crowdsourced Urban Delivery.

Authors :
Ahamed, Tanvir
Zou, Bo
Farazi, Nahid Parvez
Tulabandhula, Theja
Source :
Transportation Research Part B: Methodological. Oct2021, Vol. 152, p227-257. 31p.
Publication Year :
2021

Abstract

• Investigate assigning shipping requests to crowdsourcees with time and capacity constraints
• Propose a centralized, deep reinforcement learning-based approach
• Present a new state space representation encompassing spatial-temporal and capacity information
• Embed heuristics-guided action choice in DRL to preserve tractability and enhance efficiency
• Integrate rule-interposing into DRL to further enhance training and implementation efficiency

This paper investigates the problem of assigning shipping requests to ad hoc couriers in the context of crowdsourced urban delivery. The shipping requests are spatially distributed, each with a limited time window between the earliest time for pickup and the latest time for delivery. The ad hoc couriers, termed crowdsourcees, also have limited time availability and carrying capacity. We propose a new deep reinforcement learning (DRL)-based approach to tackle this assignment problem. A deep Q network (DQN) algorithm is trained that entails two salient features, experience replay and a target network, which enhance the efficiency, convergence, and stability of DRL training. More importantly, this paper makes three methodological contributions: 1) presenting a comprehensive and novel characterization of crowdshipping system states that encompasses spatial-temporal and capacity information of crowdsourcees and requests; 2) embedding heuristics that leverage information offered by the state representation and are based on intuitive reasoning to guide specific actions, thereby preserving tractability and enhancing training efficiency; and 3) integrating rule-interposing to prevent repeated visiting of the same routes and node sequences during routing improvement, thereby further enhancing training efficiency by accelerating learning. The computational complexities of the heuristics and the overall DQN training are investigated. The effectiveness of the proposed approach is demonstrated through extensive numerical analysis.
The results show the benefits brought by the heuristics-guided action choice, rule-interposing, and having time-related information in the state space in DRL training, the near-optimality of the solutions obtained, and the superiority of the proposed approach over existing methods in terms of solution quality, computation time, and scalability. [ABSTRACT FROM AUTHOR]
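The abstract names two standard DQN ingredients, experience replay and a target network. A minimal sketch of how these two mechanisms interact is shown below; the linear Q-function, toy state/action dimensions, reward rule, and all names are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import deque
import numpy as np

# Sketch of the two DQN mechanisms named in the abstract:
# a replay buffer sampled for off-policy updates, and a target
# network synced periodically to stabilize the bootstrap target.
STATE_DIM, N_ACTIONS = 4, 3          # assumed: compact state features, candidate assignments
GAMMA, LR, SYNC_EVERY = 0.95, 0.01, 50

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(N_ACTIONS, STATE_DIM))  # online Q-network (linear, for brevity)
W_target = W.copy()                                     # target network
replay = deque(maxlen=1000)                             # experience replay buffer

def q_values(weights, state):
    return weights @ state  # Q(s, .) for all actions

def step_env(state, action):
    # Toy transition: reward favors the action matching a hidden state feature.
    reward = 1.0 if action == int(state[0] * N_ACTIONS) else 0.0
    return reward, rng.random(STATE_DIM)

state = rng.random(STATE_DIM)
for t in range(2000):
    # Epsilon-greedy action choice with a decaying exploration rate.
    eps = max(0.05, 1.0 - t / 1000)
    if rng.random() < eps:
        action = int(rng.integers(N_ACTIONS))
    else:
        action = int(np.argmax(q_values(W, state)))
    reward, next_state = step_env(state, action)
    replay.append((state, action, reward, next_state))
    state = next_state

    if len(replay) >= 32:
        # Experience replay: update on a random minibatch of past transitions.
        for s, a, r, s2 in random.sample(list(replay), 32):
            target = r + GAMMA * np.max(q_values(W_target, s2))  # bootstrap from target net
            td_error = target - q_values(W, s)[a]
            W[a] += LR * td_error * s                            # gradient step on squared TD error
    if t % SYNC_EVERY == 0:
        W_target = W.copy()  # periodic target-network sync
```

In the paper's setting, the action choice at this step would additionally be guided by the proposed heuristics and filtered by rule-interposing rather than being purely epsilon-greedy.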

Details

Language :
English
ISSN :
0191-2615
Volume :
152
Database :
Academic Search Index
Journal :
Transportation Research Part B: Methodological
Publication Type :
Academic Journal
Accession number :
152924904
Full Text :
https://doi.org/10.1016/j.trb.2021.08.015