1. DeepDoT: Deep Framework for Detection of Tables in Document Images
- Author
- Mandhatya Singh and Puneet Goyal
- Subjects
Scheme (programming language), Computer science, business.industry, 05 social sciences, Detector, Process (computing), 010501 environmental sciences, Resolution (logic), computer.software_genre, 01 natural sciences, Task (computing), End-to-end principle, 0502 economics and business, Benchmark (computing), Table (database), Data mining, Artificial intelligence, 050207 economics, business, computer, 0105 earth and related environmental sciences, computer.programming_language
- Abstract
An efficient table detection process offers a solution for enterprises dealing with automated analysis of digital documents. Table detection is a challenging task due to low inter-class and high intra-class dissimilarities in document images. Further, the foreground-background class imbalance problem limits the performance of table detectors (especially single-stage table detectors). Existing table detectors rely on a bottom-up scheme that efficiently captures semantic features but fails to account for resolution-enriched features, which degrades overall detection performance. We propose an end-to-end trainable framework (DeepDoT) that effectively detects tables of different sizes over arbitrary scales in document images. DeepDoT uses both a top-down and a bottom-up approach, and additionally uses focal loss to handle the pervasive class imbalance problem for accurate predictions. We consider multiple benchmark datasets for a thorough evaluation: ICDAR-2013, UNLV, ICDAR-2017 POD, and MARMOT. The proposed approach yields a better F1-score than state-of-the-art table detection approaches.
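The abstract names focal loss as the remedy for foreground-background class imbalance but does not give its form. The sketch below shows the standard binary focal loss, which scales cross-entropy by a factor (1 - p_t)^gamma so that easy, well-classified examples contribute less. The function names are hypothetical, and the defaults gamma = 2.0 and alpha = 0.25 are the commonly cited values for focal loss in general, not settings confirmed for DeepDoT.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for one prediction.

    p: predicted probability of the foreground (table) class, 0 < p < 1
    y: ground-truth label, 1 for foreground, 0 for background
    """
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction (illustrative sketch).

    gamma and alpha are assumed defaults, not values taken from the abstract.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # The modulating factor (1 - p_t)^gamma shrinks the loss of easy examples.
    return alpha_t * (1.0 - p_t) ** gamma * cross_entropy(p, y)


if __name__ == "__main__":
    # An easy background example (p = 0.05) versus a hard one (p = 0.7).
    for p in (0.05, 0.7):
        print(p, cross_entropy(p, 0), focal_loss(p, 0))
```

With gamma set to 0 the modulating factor disappears and the function reduces to alpha-weighted cross-entropy, which makes it easy to compare the two losses on the same predictions.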
- Published
- 2021