201. Information extraction framework for Kurunthogai.
- Author
- Subalalitha, C N
- Subjects
- DATA mining, NATURAL language processing, BODIES of water, EXPOSITION (Rhetoric)
- Abstract
- Kurunthogai is a classical Tamil poetic masterpiece and the second book of Ettuthokai, one of the Sangam literary works. The poems of Kurunthogai express the love life of men and women who lived during the Sangam age. Kurunthogai is a massive work written by many authors. The poems are set in five different landscapes, namely Kurinchi, Mullai, Marutham, Neythal, and Pālai, and so contain much valuable historical information related to these landscapes. This paper proposes a template-based Information Extraction (IE) framework for Kurunthogai which automatically extracts the names of flora, fauna, foods, vessels, and water bodies described in it. Furthermore, it extracts Noun Unigrams, Verb Unigrams, Adjective-Noun Bigrams, and Adverb-Verb Bigrams. The Tamil Morphological Analyzer tool has been used to extract the N-grams. State-of-the-art IE techniques have attempted to extract information from expository texts, whereas the proposed IE framework extracts information from a literary text. The existing techniques extract information from monolingual texts, whereas the proposed IE framework extracts information from bilingual texts. The proposed IE framework has achieved a precision of 88.8%. The proposed framework can be applied to any literary text and used in various applications of Natural Language Processing. [ABSTRACT FROM AUTHOR]
- Published
- 2019