Swin transformer-based traffic video text tracking.
- Source :
- Applied Intelligence; Nov2024, Vol. 54 Issue 21, p10581-10595, 15p
- Publication Year :
- 2024
Abstract
- Intelligent systems, such as driving assistance systems, can assist drivers by providing basic traffic, road blockage, and possible route information to enable safe driving. The goal of scene text tracking in driver assistance systems is to locate and track scene text, milestone signs, traffic panels, and road signs in real time. Therefore, the accuracy and real-time performance of scene text localization and tracking play vital roles in intelligent driving assistance systems. However, traffic video text tracking often suffers from missed and false detections because of illumination, occlusion, and similar appearances. In this paper, we propose a new Swin transformer-based traffic video text tracking method, known as STVT, which is composed of a Siamese SwinDC transformer module, a deformable text detection module, and a text matching module. The Siamese SwinDC transformer module performs text detection by considering both temporal and spatial dimensions, mitigating the issue of missed detections caused by occlusion. The text matching module combines the semantic, visual, and geometric features of text instances to effectively differentiate visually similar text instances. Extensive experiments demonstrated that our proposed STVT method outperformed the state-of-the-art methods on various benchmark datasets. On the ICDAR2015 dataset, compared with the Free method, the mostly matched (MM) result increased by 32.0% (702 vs. 926), and the mostly lost (ML) result decreased by 33.2% (568 vs. 850). The visualization results demonstrated that the proposed STVT model can accurately detect and track occluded text instances in traffic videos. On the ICDAR2023 dataset, our method achieved a 6.01% improvement in MOTA compared with the TransDETR method, demonstrating that our proposed method is effective for small and dense text detection problems. In addition, qualitative and quantitative analyses confirmed the effectiveness and real-time performance of our proposed STVT method. [ABSTRACT FROM AUTHOR]
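The MM and ML percentages quoted in the abstract can be roughly checked from the raw counts it gives. The sketch below assumes, based only on the stated direction of each change, that 702 and 850 are the Free baseline counts and that 926 and 568 are the STVT counts; the abstract does not label them explicitly. The helper name `relative_change` is illustrative and not from the paper.

```python
def relative_change(baseline: float, new: float) -> float:
    """Return the percent change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100.0


# Assumed assignment of counts (not labeled in the abstract):
# mostly matched (MM): Free = 702, STVT = 926
# mostly lost (ML):    Free = 850, STVT = 568
mm_change = relative_change(baseline=702, new=926)
ml_change = relative_change(baseline=850, new=568)

print(f"MM change: {mm_change:+.1f}%")
print(f"ML change: {ml_change:+.1f}%")
```

Under these assumptions the ML figure matches the reported 33.2% decrease, and the MM figure comes out at about 31.9%, slightly below the reported 32.0%; the small gap may reflect rounding or underlying values not shown in the abstract.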
Details
- Language :
- English
- ISSN :
- 0924-669X
- Volume :
- 54
- Issue :
- 21
- Database :
- Complementary Index
- Journal :
- Applied Intelligence
- Publication Type :
- Academic Journal
- Accession number :
- 179690808
- Full Text :
- https://doi.org/10.1007/s10489-024-05710-9