
Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation

Authors :
Wang, Hanqing
Liang, Wei
Shen, Jianbing
Van Gool, Luc
Wang, Wenguan
Publication Year :
2022

Abstract

Since the rise of vision-language navigation (VLN), great progress has been made in instruction following -- building a follower to navigate environments under the guidance of instructions. However, far less attention has been paid to the inverse task: instruction generation -- learning a speaker to generate grounded descriptions for navigation routes. Existing VLN methods train a speaker independently and often treat it as a data augmentation tool to strengthen the follower, while ignoring rich cross-task relations. Here we describe an approach that learns the two tasks simultaneously and exploits their intrinsic correlations to boost the training of each: the follower judges whether the speaker-created instruction explains the original navigation route correctly, and vice versa. Without the need for aligned instruction-path pairs, such a cycle-consistent learning scheme is complementary to task-specific training targets defined on labeled data, and can also be applied over unlabeled paths (sampled without paired instructions). Another agent, called the creator, is added to generate counterfactual environments: it greatly changes current scenes yet leaves the items that are vital for executing the original instructions unchanged. Thus, more informative training scenes are synthesized, and the three agents compose a powerful VLN learning system. Extensive experiments on a standard benchmark show that our approach improves the performance of various follower models and produces accurate navigation instructions.

Comment: Accepted to CVPR 2022
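To illustrate the cycle-consistent idea described in the abstract, the following is a minimal sketch, not the authors' implementation: a placeholder speaker maps a path representation to an instruction representation, a placeholder follower maps it back, and reconstruction errors over both cycles serve as a training signal. The class names (Speaker, Follower), feature dimensions, and loss form are illustrative assumptions made for this sketch; it only assumes PyTorch and toy feature vectors.

import torch
import torch.nn as nn

class Speaker(nn.Module):
    """Placeholder speaker: maps a path feature to an instruction embedding."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Linear(dim, dim)

    def forward(self, path_feat):
        return self.net(path_feat)

class Follower(nn.Module):
    """Placeholder follower: maps an instruction embedding to a path feature."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Linear(dim, dim)

    def forward(self, instr_feat):
        return self.net(instr_feat)

def cycle_consistency_loss(speaker, follower, path_feat, instr_feat=None):
    """Cycle losses: path -> instruction -> path, and (when a paired
    instruction exists) instruction -> path -> instruction. Unlabeled
    paths use only the first term, mirroring the paper's use of paths
    sampled without paired instructions."""
    mse = nn.MSELoss()
    # Path cycle: the follower should recover the path the speaker described.
    instr_hat = speaker(path_feat)
    path_rec = follower(instr_hat)
    loss = mse(path_rec, path_feat)
    if instr_feat is not None:
        # Instruction cycle: the speaker should recover the instruction
        # that the follower executed.
        path_hat = follower(instr_feat)
        instr_rec = speaker(path_hat)
        loss = loss + mse(instr_rec, instr_feat)
    return loss

if __name__ == "__main__":
    speaker, follower = Speaker(), Follower()
    params = list(speaker.parameters()) + list(follower.parameters())
    opt = torch.optim.Adam(params, lr=1e-3)

    path = torch.randn(8, 64)   # toy path features (labeled or sampled)
    instr = torch.randn(8, 64)  # toy paired instruction features (may be absent)

    loss = cycle_consistency_loss(speaker, follower, path, instr)
    opt.zero_grad()
    loss.backward()
    opt.step()

In the paper this signal complements the task-specific losses on labeled instruction-path pairs; the sketch above only shows how the two agents can supervise each other without such pairs.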

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2203.16586
Document Type :
Working Paper