ALSA: Adversarial Learning of Supervised Attentions for Visual Question Answering.
- Source :
- IEEE Transactions on Cybernetics; Jun2022, Vol. 52 Issue 6, p4520-4533, 14p
- Publication Year :
- 2022
Abstract
- Visual question answering (VQA) has gained increasing attention in both natural language processing and computer vision. The attention mechanism plays a crucial role in relating the question to meaningful image regions for answer inference. However, most existing VQA methods: 1) learn the attention distribution from either free-form regions or detection boxes in the image, which is ill-suited to answering questions about the foreground object and the background form, respectively, and 2) neglect the prior knowledge of human attention and learn the attention distribution with an unguided strategy. To fully exploit the advantages of attention, the learned attention distribution should, like human attention, focus more on the question-related image regions for questions about both the foreground object and the background form. To achieve this, this article proposes a novel VQA model called adversarial learning of supervised attentions (ALSA). Specifically, two supervised attention modules, 1) free form-based and 2) detection-based, are designed to exploit the prior knowledge for attention distribution learning. To effectively learn the correlations between the question and image from different views, that is, free-form regions and detection boxes, an adversarial learning mechanism is implemented as an interplay between the two supervised attention modules. The adversarial learning reinforces the two attention modules mutually to make the learned multiview features more effective for answer inference. Experiments performed on three commonly used VQA datasets confirm the favorable performance of ALSA. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 2168-2267
- Volume :
- 52
- Issue :
- 6
- Database :
- Complementary Index
- Journal :
- IEEE Transactions on Cybernetics
- Publication Type :
- Academic Journal
- Accession number :
- 157551608
- Full Text :
- https://doi.org/10.1109/TCYB.2020.3029423