1. Potential-based multiobjective reinforcement learning approaches to low-impact agents for AI safety.
- Author
- Vamplew, Peter; Foale, Cameron; Dazeley, Richard; and Bignold, Adam
- Subjects
- *REINFORCEMENT learning, *ARTIFICIAL intelligence, *NONLINEAR operators
- Abstract
The concept of impact-minimisation has previously been proposed as an approach to addressing the safety concerns that can arise from utility-maximising agents. An impact-minimising agent takes into account the potential impact of its actions on the state of the environment when selecting actions, so as to avoid unacceptable side-effects. This paper proposes and empirically evaluates an implementation of impact-minimisation within the framework of multiobjective reinforcement learning. The key contributions are a novel potential-based approach to specifying a measure of impact, and an examination of a variety of non-linear action-selection operators so as to achieve an acceptable trade-off between achieving the agent's primary task and minimising environmental impact. These experiments also highlight a previously unreported issue with noisy estimates for multiobjective agents using non-linear action-selection, which has broader implications for the application of multiobjective reinforcement learning. [ABSTRACT FROM AUTHOR]
- Published
- 2021