An Efficient Transformer–CNN Network for Document Image Binarization.

Authors :
Zhang, Lina
Wang, Kaiyuan
Wan, Yi
Source :
Electronics (2079-9292); Jun2024, Vol. 13 Issue 12, p2243, 28p
Publication Year :
2024

Abstract

Color image binarization plays a pivotal role in image preprocessing and significantly affects subsequent tasks, particularly text recognition. This paper concentrates on document image binarization (DIB), which aims to separate an image into foreground (text) and background (non-text content). We thoroughly analyze conventional and deep-learning-based approaches and conclude that prevailing DIB methods rely on deep learning. Furthermore, we examine receptive fields before and after network training to underscore the advantages of the Transformer model. We then introduce a lightweight model based on the U-Net structure and enhanced with the MobileViT module to better capture global features in document images. Because it learns both local and global features, the proposed model demonstrates competitive performance on two standard datasets (DIBCO2012 and DIBCO2017) and good robustness on the DIBCO2019 dataset. Notably, the proposed method is a straightforward end-to-end model that requires no additional image preprocessing or post-processing and does not use ensemble models. Moreover, its parameter count is less than one-eighth that of the model that achieves the best results on most DIBCO datasets. Finally, two sets of ablation experiments are conducted to verify the effectiveness of the proposed binarization model. [ABSTRACT FROM AUTHOR]
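
To make the task concrete, the following is a minimal sketch of a conventional global-threshold baseline (Otsu's method), the kind of non-learning approach the abstract contrasts with deep-learning methods. It is not the Transformer-CNN model described above. The function names `otsu_threshold` and `binarize` are illustrative, and the input is assumed to be an 8-bit grayscale image with dark text on a lighter background.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the Otsu threshold (0-255) for a uint8 grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    # Between-class variance for every candidate threshold t.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                # thresholds that leave one class empty
    return int(np.argmax(sigma_b))

def binarize(gray):
    """Map pixels to 0 (foreground, assumed dark text) or 1 (background)."""
    t = otsu_threshold(gray)
    return (gray > t).astype(np.uint8)

if __name__ == "__main__":
    # Synthetic example: a dark stroke (value 50) on a light page (value 200).
    page = np.full((8, 8), 200, dtype=np.uint8)
    page[2:6, 3] = 50
    print(binarize(page))
```

A single global threshold of this kind tends to struggle with uneven illumination, stains, and bleed-through, which is one reason learned models that combine local and global context are studied for degraded documents.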

Details

Language :
English
ISSN :
2079-9292
Volume :
13
Issue :
12
Database :
Complementary Index
Journal :
Electronics (2079-9292)
Publication Type :
Academic Journal
Accession number :
178154479
Full Text :
https://doi.org/10.3390/electronics13122243