You should know more: Learning external knowledge for visual dialog.

Authors :
Zhao, Lei
Zhang, Haonan
Li, Xiangpeng
Yang, Sen
Song, Yuanfeng
Source :
Neurocomputing. Jun 2022, Vol. 488, p54-65. 12p.
Publication Year :
2022

Abstract

Visual dialog is a task in which two agents carry out a multi-round conversation based on an image, a caption, and the dialog history. Despite recent progress, existing methods still degrade in complex scenarios. Handling these scenarios depends on logical reasoning that requires commonsense priors. In this paper, we propose a novel visual dialog pipeline named Structured Knowledge-Aware Network (SKANet), consisting of an Image Knowledge-Aware Module and a Caption Knowledge-Aware Module. Specifically, the Image and Caption Knowledge-Aware Modules construct commonsense knowledge graphs from ConceptNet. We apply SKANet to two sub-tasks: conventional visual dialog and a goal-oriented visual dialog named 'image guessing'. For conventional visual dialog, SKANet is combined with an additional Multi-Modality Fusion Module, which is designed to explore the visual content and the textual context of the dialog history. For goal-oriented visual dialog, we directly apply the Image and Caption Knowledge-Aware Modules to the two agents, respectively. Experimental results on the VisDial v0.9 and VisDial v1.0 datasets show that our proposed method outperforms comparative methods on both sub-tasks. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0925-2312
Volume :
488
Database :
Academic Search Index
Journal :
Neurocomputing
Publication Type :
Academic Journal
Accession number :
156253083
Full Text :
https://doi.org/10.1016/j.neucom.2021.10.121