Realtime multi-scale scene text detection with scale-based region proposal network.

Authors :
He, Wenhao
Zhang, Xu-Yao
Yin, Fei
Luo, Zhenbo
Ogier, Jean-Marc
Liu, Cheng-Lin
Source :
Pattern Recognition. Feb2020, Vol. 98, pN.PAG-N.PAG. 1p.
Publication Year :
2020

Abstract

• We propose a novel network named SRPN to realize both text/non-text localization and scale estimation efficiently.
• A two-stage detection scheme based on SRPN is proposed to avoid using multi-scale pyramid input and achieve faster detection speed.
• The proposed method achieves remarkable speedup on ICDAR2015, ICDAR2013 and MSRA-TD500 while keeping competitive performance.
• Ablation experiments are given to demonstrate the reasonableness of the proposed method from different aspects.

Multi-scale approaches have been widely used to achieve high accuracy in scene text detection, but they usually slow down the whole system. In this paper, we propose a two-stage framework for realtime multi-scale scene text detection. The first stage employs a novel Scale-based Region Proposal Network (SRPN), which can localize text over a wide scale range and estimate text scale efficiently. Based on SRPN, non-text regions are filtered out and text region proposals are generated. Moreover, based on the text scale estimated by SRPN, small or big texts in region proposals are resized into a unified normal scale range. The second stage then adopts a Fully Convolutional Network based scene text detector to localize text words from the proposals of the first stage. The second-stage detector handles only a narrow scale range, but does so accurately. Since most non-text regions are eliminated efficiently by SRPN, and texts in proposals are properly scaled to avoid multi-scale pyramid processing, the whole system is quite fast. We evaluate both the performance and the speed of the proposed method on the ICDAR2015, ICDAR2013, and MSRA-TD500 datasets. On ICDAR2015, our system reaches a state-of-the-art F-measure of 85.40% at 16.5 fps (frames per second), and a competitive 79.66% at 35.1 fps, either of which is more than 5 times faster than previous best methods. On ICDAR2013 and MSRA-TD500, we also achieve remarkable speedup while keeping competitive performance. Ablation experiments are also provided to demonstrate the reasonableness of our method. [ABSTRACT FROM AUTHOR]
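
The hand-off between the two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Proposal` structure, the use of text height in pixels as the scale estimate, the score threshold of 0.5, and the target height of 32 pixels are all assumptions made for illustration. The sketch only shows the idea of discarding low-scoring regions and computing, for each remaining proposal, the resize factor that brings its text to a single normal scale before the second-stage detector runs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Proposal:
    """Hypothetical first-stage output (not the paper's data structure)."""
    box: Tuple[int, int, int, int]  # (x, y, width, height) in the input image
    text_score: float               # text/non-text confidence in [0, 1]
    est_text_height: float          # assumed scale estimate, in pixels


def resize_factor(est_text_height: float, target_height: float = 32.0) -> float:
    """Factor that maps the estimated text height onto the assumed target height."""
    if est_text_height <= 0:
        raise ValueError("estimated text height must be positive")
    return target_height / est_text_height


def plan_second_stage(
    proposals: List[Proposal],
    score_threshold: float = 0.5,   # assumed value, for illustration only
    target_height: float = 32.0,    # assumed value, for illustration only
) -> List[Tuple[Tuple[int, int, int, int], float, Tuple[int, int]]]:
    """Drop likely non-text regions and compute each remaining crop's resized size."""
    plan = []
    for p in proposals:
        if p.text_score < score_threshold:
            continue  # treated as non-text and filtered out
        factor = resize_factor(p.est_text_height, target_height)
        _, _, w, h = p.box
        new_size = (max(1, round(w * factor)), max(1, round(h * factor)))
        plan.append((p.box, factor, new_size))
    return plan


proposals = [
    Proposal(box=(0, 0, 100, 20), text_score=0.9, est_text_height=8.0),       # small text
    Proposal(box=(10, 10, 400, 200), text_score=0.8, est_text_height=128.0),  # big text
    Proposal(box=(5, 5, 50, 50), text_score=0.1, est_text_height=16.0),       # likely non-text
]
plan = plan_second_stage(proposals)
```

In this sketch small text is upsampled and big text is downsampled, so each crop is processed once at a single scale instead of the whole image being processed at several pyramid levels.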

Subjects

Subjects :
*DETECTORS

Details

Language :
English
ISSN :
00313203
Volume :
98
Database :
Academic Search Index
Journal :
Pattern Recognition
Publication Type :
Academic Journal
Accession number :
139407593
Full Text :
https://doi.org/10.1016/j.patcog.2019.107026