Accelerating Reinforcement Learning using EEG-based implicit human feedback
- Authors
- Raghupathy Sivakumar, Mohit Agarwal, Faramarz Fekri, Ekansh Gupta, and Duo Xu
- Subjects
- Reinforcement learning, Electroencephalography, Human-in-the-loop, Artificial intelligence, Robustness (computer science), Cognitive Neuroscience, Computer Science Applications
- Abstract
Providing Reinforcement Learning (RL) agents with human feedback can dramatically improve various aspects of learning. However, previous methods require the human observer to give inputs explicitly (e.g., by pressing buttons or through a voice interface), burdening the human in the loop of the RL agent's learning process. Moreover, providing explicit human advice (feedback) continuously is not always feasible and can be too restrictive, e.g., in autonomous driving or disability rehabilitation. In this work, we investigate capturing a human's intrinsic reactions as implicit (and natural) feedback through EEG, in the form of error-related potentials (ErrP), providing a natural and direct way for humans to improve the RL agent's learning. Human intelligence can thus be integrated with RL algorithms via implicit feedback to accelerate the agent's learning. We develop three reasonably complex 2D discrete navigational games to experimentally evaluate the overall performance of the proposed work, and we verify the motivation for using ErrPs as feedback through subject experiments. The major contributions of our work are as follows: (i) we propose and experimentally validate zero-shot learning of ErrPs, where ErrPs learned on one game transfer to other, unseen games; (ii) we propose a novel RL framework for integrating implicit human feedback via ErrPs with the RL agent, improving label efficiency and robustness to human mistakes; and (iii) compared to prior work, we scale the application of ErrPs to reasonably complex environments and demonstrate the significance of our approach for accelerated learning through real user experiments.
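To make the idea concrete, the following is a minimal sketch, not the paper's implementation, of how an ErrP-derived signal could shape rewards inside a tabular Q-learning loop on a toy grid game. The errp_feedback decoder, its roughly 80% accuracy, the penalty weight, and the grid layout are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Minimal sketch, NOT the paper's implementation: tabular Q-learning on a toy
# 5x5 grid where a hypothetical errp_feedback() stands in for an EEG-based
# ErrP classifier that flags an action the observer perceives as a mistake.
# The decoder, its ~80% accuracy, and the shaping penalty are assumptions.

N, N_ACTIONS = 5, 4                    # grid size; actions: up/down/left/right
ALPHA, GAMMA, EPS, ERRP_PENALTY = 0.1, 0.95, 0.1, 0.5

rng = np.random.default_rng(0)
Q = np.zeros((N * N, N_ACTIONS))

def optimal_action(state):
    # Toy "ground truth" the simulated observer judges against:
    # move right until the last column, then move down.
    return 3 if state % N < N - 1 else 1

def errp_feedback(state, action):
    # Noisy implicit feedback: the label is correct ~80% of the time.
    wrong = action != optimal_action(state)
    return wrong if rng.random() < 0.8 else not wrong

def step(state, action):
    row, col = divmod(state, N)
    row = min(N - 1, max(0, row + (action == 1) - (action == 0)))
    col = min(N - 1, max(0, col + (action == 3) - (action == 2)))
    nxt = row * N + col
    done = nxt == N * N - 1            # goal in the bottom-right corner
    return nxt, (1.0 if done else 0.0), done

for episode in range(300):
    s = 0
    for t in range(100):               # cap episode length
        a = int(rng.integers(N_ACTIONS)) if rng.random() < EPS else int(Q[s].argmax())
        s2, r, done = step(s, a)
        if errp_feedback(s, a):        # shape the reward with the ErrP signal
            r -= ERRP_PENALTY
        Q[s, a] += ALPHA * (r + GAMMA * Q[s2].max() * (not done) - Q[s, a])
        s = s2
        if done:
            break
```

Unlike explicit button-press labels, the shaping term here costs the observer no extra effort to produce, which is the label-efficiency argument the abstract makes for implicit feedback.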
- Published
- 2021