1. An approach for applying natural language processing to image classification problems.
- Author
- Astolfi, Gilberto; Sant'Ana, Diego André; Porto, João Vitor de Andrade; Rezende, Fábio Prestes Cesar; Tetila, Everton Castelão; Matsubara, Edson Takashi; Pistori, Hemerson
- Subjects
- *NATURAL language processing, *DEEP learning, *COMPUTER vision, *RECURRENT neural networks, *SOURCE code, *COMPUTER simulation
- Abstract
A growing interest in applying Natural Language Processing (NLP) models to computer vision problems has recently emerged. This interest is motivated by the success of NLP models in tasks such as translation and text summarization. In this paper, we propose a new method for applying NLP to image classification problems. We aim to represent the visual patterns of objects by using a sequence of alphabet symbols and then train a Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), or Transformer using these sequences to classify objects. An extensive experimental evaluation using a limited number of images for training has been conducted to compare our method with the ResNet-50 deep learning architecture. The results obtained by the proposed method outperform ResNet-50 in all test scenarios. In one test, the method achieved an average accuracy of 95.3% compared to 89.9% of ResNet-50. The source code (http://git.inovisao.ucdb.br/inovisao/applying-npl-to-image-classification) and dataset (https://doi.org/10.6084/m9.figshare.20055602.v1) are publicly available. [ABSTRACT FROM AUTHOR]
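The abstract does not say how visual patterns are turned into alphabet symbols, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical scheme in which each non-overlapping patch of a grayscale image is mapped to a letter by quantizing its mean intensity; the function names `image_to_symbols` and `encode`, the patch size, and the four-letter alphabet are all assumptions made for illustration. The resulting integer token sequence is the kind of input a GRU, LSTM, or Transformer classifier could be trained on.

```python
from string import ascii_lowercase

# Hypothetical 4-letter alphabet; the paper's actual alphabet is not given in the abstract.
ALPHABET = ascii_lowercase[:4]


def image_to_symbols(image, patch=2, alphabet=ALPHABET):
    """Map a grayscale image (list of rows, values 0..255) to a symbol string.

    Assumption: each non-overlapping patch becomes one letter chosen by
    quantizing the patch's mean intensity into len(alphabet) levels.
    """
    height, width = len(image), len(image[0])
    levels = len(alphabet)
    symbols = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            pixels = [
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ]
            mean = sum(pixels) / len(pixels)
            index = min(int(mean * levels / 256), levels - 1)
            symbols.append(alphabet[index])
    return "".join(symbols)


def encode(symbols, alphabet=ALPHABET):
    """Convert a symbol string to integer token ids (0 is reserved for padding)."""
    vocab = {s: i + 1 for i, s in enumerate(alphabet)}
    return [vocab[s] for s in symbols]


# Toy 4x4 image: four 2x2 patches with mean intensities 0, 255, 100, 200.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [100, 100, 200, 200],
    [100, 100, 200, 200],
]
sequence = image_to_symbols(image)  # "adbd"
token_ids = encode(sequence)        # [1, 4, 2, 4]
```

In a full pipeline, sequences like `token_ids` would be padded to a common length and passed through an embedding layer into the recurrent or Transformer model; that training step is omitted here because the abstract gives no details about it.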
- Published
- 2022