
SINet: Improving relational features in two-stage referring expression comprehension.

Authors :
Guo, Wenya
Zhang, Ying
Yuan, Xiaojie
Source :
Expert Systems with Applications. Oct2024, Vol. 251, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

Referring expression comprehension (REC) requires locating the region referred to by the expression, where one of the key challenges is to distinguish the correct object from others of the same category using the described relationships. Existing two-stage methods explicitly establish the visual relationships among objects based on spatial information, including location and scale. This paper investigates the role of relational features. We find that the predicted result becomes incorrect when the region scale changes. The trained model statistically tends to predict larger regions as the results and performs worse for objects of smaller scales. To alleviate this problem, we propose a Scale-Insensitive Network (SINet) to improve robustness to scale information during the visual relational feature modeling process. Specifically, a category-wise random pooling module is designed to efficiently change object scales, and SINet simultaneously takes the original and resized regions as inputs. We introduce a consistency loss to train the model to remain correct under different scales. Our method can be integrated into existing two-stage methods to alleviate their dependence on scale information and promote their use of key visual features. Extensive experiments on three commonly used datasets (RefCOCO, RefCOCO+ and RefCOCOg) demonstrate the superiority of SINet over state-of-the-art two-stage methods in terms of REC accuracy.
• We are the first to consider the effect of region scale on REC performance.
• The proposed scale-insensitive network can be integrated into any two-stage method.
• We have conducted extensive qualitative and quantitative experiments on widely used datasets.
[ABSTRACT FROM AUTHOR]
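The abstract does not give the form of the consistency loss, so the following is only an illustrative sketch of the general idea: score the candidate regions once from the original view and once from the resized view, supervise both, and penalize disagreement between the two predictions. The function names, the symmetric KL penalty, and the `weight` parameter are all assumptions made for illustration and are not taken from the paper.

```python
import math

EPS = 1e-12  # guards against log(0); an implementation detail of this sketch


def softmax(scores):
    """Turn raw region scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """Negative log-probability assigned to the ground-truth region."""
    return -math.log(max(probs[target], EPS))


def kl(p, q):
    """KL divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / max(qi, EPS)) for pi, qi in zip(p, q) if pi > 0)


def consistency_objective(scores_orig, scores_resized, target, weight=1.0):
    """Hypothetical objective: task loss on both views plus a symmetric KL
    term that is zero when both views give the same prediction."""
    p = softmax(scores_orig)
    q = softmax(scores_resized)
    task = cross_entropy(p, target) + cross_entropy(q, target)
    consistency = 0.5 * (kl(p, q) + kl(q, p))
    return task + weight * consistency


if __name__ == "__main__":
    # Scores for three candidate regions; region 0 is the referred object.
    original = [2.0, 1.0, 0.0]
    resized = [0.5, 1.5, 0.0]  # the prediction drifts after rescaling
    print(consistency_objective(original, original, target=0))
    print(consistency_objective(original, resized, target=0))
```

In this sketch the penalty vanishes when the two views agree and grows as the resized view pulls the prediction away from the original, which is one plausible way to express "remain correct under different scales".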

Subjects

Subjects :
*FORECASTING

Details

Language :
English
ISSN :
09574174
Volume :
251
Database :
Academic Search Index
Journal :
Expert Systems with Applications
Publication Type :
Academic Journal
Accession number :
177514241
Full Text :
https://doi.org/10.1016/j.eswa.2024.123794