
Memory-Augmented Reinforcement Learning for Image-Goal Navigation

Authors :
Lina Mezghani
Sainbayar Sukhbaatar
Thibaut Lavril
Oleksandr Maksymets
Dhruv Batra
Piotr Bojanowski
Karteek Alahari
Apprentissage de modèles à partir de données massives (Thoth)
Inria Grenoble - Rhône-Alpes
Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Jean Kuntzmann (LJK)
Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)
Université Grenoble Alpes (UGA)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)
Université Grenoble Alpes (UGA)
Meta AI
Georgia Institute of Technology [Atlanta]
ANR-18-CE23-0011, AVENUE, Réseau de mémoire visuelle pour l'interprétation de scènes [Visual memory network for scene interpretation] (2018)
Facebook AI Research [Paris] (FAIR)
Facebook
Facebook AI Research (FAIR)
Source :
IROS - IEEE/RSJ International Conference on Intelligent Robots and Systems, Oct 2022, Kyoto, Japan, HAL
Publication Year :
2022
Publisher :
HAL CCSD, 2022.

Abstract

In this work, we address the problem of image-goal navigation in visually realistic 3D environments. This task involves navigating to a location indicated by a target image in a previously unseen environment. Earlier attempts, including RL-based and SLAM-based approaches, have either generalized poorly or relied heavily on pose/depth sensors. We present a novel method that leverages a cross-episode memory to learn to navigate. We first train a state-embedding network in a self-supervised fashion, and then use it to embed previously visited states into a memory. To avoid overfitting, we apply data augmentation to the RGB input during training. We validate our approach through extensive evaluations, showing that our data-augmented memory-based model establishes a new state of the art on the image-goal navigation task on the challenging Gibson dataset. We obtain this competitive performance from RGB input only, without access to additional sensors such as position or depth.

Details

Language :
English
Database :
OpenAIRE
Journal :
IROS - IEEE/RSJ International Conference on Intelligent Robots and Systems, Oct 2022, Kyoto, Japan, HAL
Accession number :
edsair.doi.dedup.....addb3e98f000da27f8491a2fb9633b2e