CDText: Scene text detector based on context-aware deformable transformer.
- Author
- Wu, Yirui; Kong, Qiran; Yong, Lai; Narducci, Fabio; and Wan, Shaohua
- Subjects
- *TEXT recognition, *DETECTORS, *FEATURE extraction, *COMPARATIVE method
- Abstract
• CDText detects text of arbitrary shapes by encoding context information.
• The feature extractor refines the feature map with dilated context encoding blocks.
• The transformer aggregates text features of detection boxes for instance segmentation.

Scene text detection aims to precisely locate text regions in natural scenes. However, existing methods still struggle to detect arbitrary-shaped text because of their limited feature representation capability. To alleviate this problem, we propose a scene text detector, CDText, built on a context-aware deformable transformer. Specifically, CDText first adopts several convolution kernel designs for feature extraction, giving receptive fields of different sizes for multi-scale feature perception and fusion. Meanwhile, a multi-head self-attention mechanism strengthens CDText's global reasoning ability, enriching the feature maps with context information by extracting implicit relationships between multi-scale text features. Moreover, CDText includes a segmentation head that segments text instances of arbitrary shapes from rectangular detection boxes. Experiments show that CDText outperforms the compared methods in detection accuracy, achieving F-scores of 92.7, 81.9, and 82.9 on the ICDAR2013, Total Text, and CTW-1500 datasets, respectively. [ABSTRACT FROM AUTHOR]
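The idea of receptive fields of different sizes can be illustrated with a small sketch. The abstract does not give kernel sizes, dilation rates, or the fusion rule, so everything below is an assumption for illustration only: a 1D signal stands in for a feature map, the rates 1, 2, and 4 are hypothetical, and averaging the branches is a stand-in for whatever fusion CDText actually uses.

```python
def receptive_field(kernel_size, dilation):
    """Span of input positions covered by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D dilated convolution (cross-correlation)."""
    k = len(kernel)
    span = dilation * (k - 1)  # distance between first and last kernel tap
    pad = span // 2
    padded = [0.0] * pad + list(signal) + [0.0] * (span - pad)
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def multi_scale_fuse(signal, kernel, dilations):
    """Run one branch per dilation rate and average them (hypothetical fusion)."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) / len(branches) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    kernel = [1, 1, 1]
    for d in (1, 2, 4):  # hypothetical dilation rates
        print(d, receptive_field(len(kernel), d), dilated_conv1d(impulse, kernel, d))
    print(multi_scale_fuse(impulse, kernel, (1, 2, 4)))
```

With the same three-tap kernel, a larger dilation rate spreads the taps further apart, so each branch sees a wider context at no extra parameter cost; fusing the branches combines fine and coarse context in a single response.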
- Published
- 2023