SINet: Improving relational features in two-stage referring expression comprehension.
- Source :
- Expert Systems with Applications. Oct 2024, Vol. 251, pN.PAG-N.PAG. 1p.
- Publication Year :
- 2024
- Abstract :
- Referring expression comprehension (REC) requires locating the region referred to by the expression, where one of the key challenges is to distinguish the correct object from others of the same category using the described relationships. Existing two-stage methods explicitly establish the visual relationships among objects based on spatial information, including location and scale. This paper investigates the role of relational features. We find that the predicted result becomes incorrect when the region scale changes. The trained model statistically tends to predict larger regions as the results and performs worse for objects of smaller scales. To alleviate this problem, we propose a Scale-Insensitive Network (SINet) to improve the robustness to scale information during the visual relational feature modeling process. Specifically, a category-wise random pooling module is designed to efficiently change object scales, and SINet simultaneously takes the original and resized regions as inputs. We introduce a consistency loss to train the model to remain correct under different scales. Our method can be integrated into existing two-stage methods to alleviate their dependence on scale information and promote their utilization of key visual features. Extensive experimental results on three commonly used datasets, RefCOCO, RefCOCO+ and RefCOCOg, demonstrate the superiority of SINet over the state-of-the-art two-stage methods in terms of REC accuracy.
  • We are the first to consider the effect of region scale on REC performance.
  • The proposed scale-insensitive network can be integrated into any two-stage method.
  • We have conducted extensive qualitative and quantitative experiments on widely used datasets.
  [ABSTRACT FROM AUTHOR]
- Subjects :
- *FORECASTING
Details
- Language :
- English
- ISSN :
- 09574174
- Volume :
- 251
- Database :
- Academic Search Index
- Journal :
- Expert Systems with Applications
- Publication Type :
- Academic Journal
- Accession number :
- 177514241
- Full Text :
- https://doi.org/10.1016/j.eswa.2024.123794