1. PMST: A parallel and miniature Swin transformer for logo detection.
- Author
-
Li, Bowen; Zhang, Jianxun; Cao, Jie; Zhang, Jie; and Gao, Linfeng
- Subjects
-
*TRANSFORMER models, *COMPUTER vision, *IMAGE fusion, *FEATURE extraction, *OBJECT recognition (Computer vision), *DEEP learning
- Abstract
-
With the popularity of online shopping and the rise of various brands, logo detection is attracting growing attention from researchers. However, accurately detecting logos that are multiscale, similar to one another, diverse and variable in shape remains a challenge. The Swin transformer is a milestone high-performance deep learning method that works across modalities, domains and computer vision tasks. We improve the Swin transformer for better image feature extraction and greater robustness when applied to logo detection. In this paper, we propose the PMST as a backbone for logo detection, and design a bypass-parallelizable shift module and a miniature window tandem shift strategy to further enhance image feature fusion and transfer between windows. The PMST achieves 79.2 box AP on the LogoData dataset, surpassing most detection models. To the best of our knowledge, the PMST is the first transformer backbone applied to logo detection. It achieves state-of-the-art results on the FlickrLogos-32 and FoodLogoDet-1500 datasets, and also achieves excellent efficiency on the ImageNet-1K, OpenBrands-80 and MS-COCO datasets. The code is available at https://github.com/blowhen/PMST. [ABSTRACT FROM AUTHOR]
- Published
- 2023