End-to-End Handwritten Paragraph Text Recognition Using a Vertical Attention Network
- Source :
- IEEE Transactions on Pattern Analysis and Machine Intelligence. 45:508-524
- Publication Year :
- 2023
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2023.
Abstract
- Unconstrained handwritten text recognition remains challenging for computer vision systems. Paragraph text recognition is traditionally achieved with two models: the first for line segmentation and the second for text line recognition. We propose a unified end-to-end model using hybrid attention to tackle this task. The model is designed to process a paragraph image iteratively, line by line. It can be split into three modules. An encoder generates feature maps from the whole paragraph image. Then, an attention module recurrently generates a vertical weighted mask that focuses on the features of the current text line. In this way, it performs a kind of implicit line segmentation. For the features of each text line, a decoder module recognizes the associated character sequence, leading to the recognition of the whole paragraph. We achieve state-of-the-art character error rates at paragraph level on three popular datasets: 1.91% for RIMES, 4.45% for IAM and 3.59% for READ 2016. Our code and trained model weights are available at https://github.com/FactoDeepLearning/VerticalAttentionOCR.
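The vertical weighted mask described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the repository linked above); it only shows the general idea of a softmax over the rows of a feature map followed by a weighted sum that collapses the vertical axis. The function name `vertical_attention`, the nested-list layout and the hand-picked row scores are assumptions made for illustration; in a trained model the scores would come from learned layers.

```python
import math


def vertical_attention(features, row_scores):
    """Collapse an H x W x C feature map into W x C line features.

    features:   nested lists indexed [row][column][channel]
    row_scores: one unnormalized score per row (hypothetical input; a
                trained model would compute these with learned layers)
    """
    # Softmax over the vertical axis gives one weight per feature-map row.
    peak = max(row_scores)
    exps = [math.exp(s - peak) for s in row_scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    height = len(features)
    width = len(features[0])
    channels = len(features[0][0])
    # Weighted sum over rows: rows with a high weight dominate the result.
    line = [
        [sum(weights[h] * features[h][w][c] for h in range(height))
         for c in range(channels)]
        for w in range(width)
    ]
    return weights, line


# Toy feature map: 3 rows, 2 columns, 1 channel. Scores strongly favor row 1.
features = [
    [[1.0], [1.0]],
    [[5.0], [7.0]],
    [[9.0], [9.0]],
]
weights, line = vertical_attention(features, [0.0, 20.0, 0.0])
```

With scores this sharply peaked, nearly all the weight falls on the middle row, so the collapsed features are close to that row's values. Repeating the step with different scores would select a different line, which is the sense in which such a mask acts as an implicit line segmentation.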
- Subjects :
- FOS: Computer and information sciences
Computational Theory and Mathematics
Artificial Intelligence
Computer Vision and Pattern Recognition (cs.CV)
Applied Mathematics
ComputingMethodologies_DOCUMENTANDTEXTPROCESSING
Computer Science - Computer Vision and Pattern Recognition
0202 electrical engineering, electronic engineering, information engineering
[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]
020201 artificial intelligence & image processing
02 engineering and technology
Computer Vision and Pattern Recognition
Software
Details
- ISSN :
- 1939-3539 and 0162-8828
- Volume :
- 45
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- Accession number :
- edsair.doi.dedup.....71ed0a9d832e3822bf7b7c764b58eb4c