Contextual reinforcement learning for supply chain management.

Authors :: Batsis, Alex
Samothrakis, Spyridon
Source :: Expert Systems with Applications. Sep2024:Part A, Vol. 249, pN.PAG-N.PAG. 1p.
Publication Year :: 2024
Abstract: Efficient generalisation in supply chain inventory management is challenging due to a potential mismatch between the optimised model and objective reality. It is hard to know how the real world is configured and, thus, hard to train an agent optimally for it. We address this problem by combining offline training and online adaptation. Agents were trained offline using data from all possible environmental configurations, termed contexts. During an online adaptation phase, agents search for the context that maximises rewards. Agents adapted online rapidly and achieved performance close to that obtained when the context is known a priori. In particular, they acted optimally without inferring the correct context, instead finding a context suitable for reward maximisation. By enabling agents to leverage offline training and online adaptation, we improve their efficiency and effectiveness in unknown environments. The methodology has broader potential applications and contributes to making RL algorithms useful in practical scenarios. We have released the code for this paper under https://github.com/abatsis/supply_chain_few_shot_RL.
• Agents are usually trained in simulations, but tested in the real world.
• We use online adaptation to close the gap between simulation and the real world.
• This problem is prevalent in real-world applications of AI.
• We test our methods in supply chain optimisation.
[ABSTRACT FROM AUTHOR]
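The online adaptation step described in the abstract (searching for the context that maximises reward, rather than identifying the true one) can be illustrated with a toy sketch. This is not the authors' implementation, which is available in the repository linked above. Everything here is an assumption made for illustration: the single-product inventory model, the reward coefficients, the hypothetical names `run_episode` and `adapt_online`, and the stand-in "context-conditioned policy" that simply stocks up to the assumed mean demand.

```python
import random


def run_episode(true_mean_demand, assumed_context, rng, steps=50):
    """Toy single-product inventory episode (illustrative assumption only).

    The 'policy' is conditioned on an assumed context: each step it stocks
    exactly `assumed_context` units. Stock does not carry over between steps.
    """
    total_reward = 0.0
    for _ in range(steps):
        stock = assumed_context
        # Demand is drawn around the true (hidden) mean; never negative.
        demand = max(0, round(rng.gauss(true_mean_demand, 2)))
        sold = min(stock, demand)
        leftover = stock - sold
        unmet = demand - sold
        # Assumed coefficients: revenue 5, holding cost 1, stockout penalty 2.
        total_reward += 5.0 * sold - 1.0 * leftover - 2.0 * unmet
    return total_reward


def adapt_online(true_mean_demand, candidate_contexts, rng, episodes_per_context=5):
    """Try each candidate context online and keep the best-rewarded one."""
    best_context, best_reward = None, float("-inf")
    for context in candidate_contexts:
        rewards = [
            run_episode(true_mean_demand, context, rng)
            for _ in range(episodes_per_context)
        ]
        average = sum(rewards) / len(rewards)
        if average > best_reward:
            best_context, best_reward = context, average
    return best_context, best_reward


rng = random.Random(0)
candidates = [5, 10, 15, 20, 25, 30, 35]
best_context, best_reward = adapt_online(20, candidates, rng)
print(best_context, best_reward)
```

In this toy setting the selected context need not equal the true mean demand, because stockouts are penalised more heavily than leftover stock; the search only cares about reward, which mirrors the abstract's point that agents can act well without inferring the correct context.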