Table classification using both structure and content information: A case study of financial documents

Authors :
Quanzhi Li
Rui Fang
Sameena Shah
Source :
IEEE BigData
Publication Year :
2016
Publisher :
IEEE, 2016.

Abstract

Tables are significant document components. Table extraction and classification are critical for exploring, retrieving, and mining the knowledge encoded in tables. This paper presents a learning-based approach for classifying tables based on their content and structural information, with a focus on tables in financial documents. To the best of our knowledge, this is the first study on classifying tables in the financial domain, and also the first study of table classification based on table semantics, a more fine-grained level than in previous studies. The experimental results show that the approach can effectively classify financial tables. We also analyze which features are important and how to generate them. The feature identification and generation approach can potentially be applied to other domains.

Details

Database :
OpenAIRE
Journal :
2016 IEEE International Conference on Big Data (Big Data)
Accession number :
edsair.doi...........aa3282b04edacb011181bca9ee5792e5