Intelligent Part of Speech tagger for Hindi.

Authors :
Dutta, Devashish
Halder, Subhanu
Gayen, Tirthankar
Source :
Procedia Computer Science; 2023, Vol. 218, p604-611, 8p
Publication Year :
2023

Abstract

English parts of speech such as noun, verb, adverb, adjective, pronoun, preposition, interjection, and conjunction have counterparts in Hindi, but the two sets are not identical. Hindi grammar assigns Part of Speech (POS) categories based on morphological features and on the occurrence of a word/lexeme in a sentence. Existing POS tagging techniques developed for English may not work well for Indian languages like Hindi, because the grammatical structure of a relatively free word order language such as Hindi differs from that of English. Stochastic taggers may perform poorly because they do not take morphological information into account. In the available Hindi corpora, individual tags usually occur with low frequency, so a larger corpus with more diverse sentence types can provide better results. However, even with smoothing techniques, most of these taggers fail to give correct results in the presence of unknown words. Considering these aspects, this paper proposes an Intelligent POS tagger for Hindi based on the Viterbi algorithm and K-Nearest Neighbour, capable of providing more accurate results than Viterbi alone in the presence of unknown words. [ABSTRACT FROM AUTHOR]
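
The abstract does not say how the Viterbi decoder and the K-Nearest Neighbour step are combined, so the following is only a minimal sketch of one plausible arrangement, not the authors' method. It assumes a bigram HMM with add-one smoothing for known words and, for unknown words, a hypothetical KNN vote among the k known words sharing the longest suffix. The function names, the suffix-overlap similarity, k=3, the tag labels, and the romanized toy corpus are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict

def train(tagged_sentences):
    """Count tag transitions and word emissions from (word, tag) sentences."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    for sent in tagged_sentences:
        prev = "<s>"
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    return trans, emit

def shared_suffix_len(a, b):
    """Length of the longest common suffix (a crude morphological cue)."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def knn_tag_probs(word, emit, k=3):
    """Assumed unknown-word model: vote among the k most suffix-similar known words."""
    scored = [(shared_suffix_len(word, w), tag)
              for tag, words in emit.items() for w in words]
    scored.sort(key=lambda pair: -pair[0])  # stable sort, best matches first
    votes = Counter(tag for _, tag in scored[:k])
    total = sum(votes.values())
    return {tag: count / total for tag, count in votes.items()}

def viterbi(words, trans, emit, k=3):
    """Viterbi decoding; unknown words take emission scores from the KNN vote."""
    if not words:
        return []
    tags = list(emit)
    vocab = {w for counts in emit.values() for w in counts}

    def log_trans(prev, tag):  # add-one smoothed transition probability
        return math.log((trans[prev][tag] + 1) /
                        (sum(trans[prev].values()) + len(tags)))

    def log_emit(tag, word, knn):
        if knn is not None:  # unknown word: use KNN estimate with a small floor
            return math.log(knn.get(tag, 0.0) + 1e-6)
        return math.log((emit[tag][word] + 1) /
                        (sum(emit[tag].values()) + len(vocab)))

    best, back = {}, []
    for i, word in enumerate(words):
        knn = None if word in vocab else knn_tag_probs(word, emit, k)
        new_best, pointers = {}, {}
        for tag in tags:
            if i == 0:
                prev, score = None, log_trans("<s>", tag)
            else:
                prev = max(tags, key=lambda p: best[p] + log_trans(p, tag))
                score = best[prev] + log_trans(prev, tag)
            new_best[tag] = score + log_emit(tag, word, knn)
            pointers[tag] = prev
        best = new_best
        back.append(pointers)

    # Follow back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[t])
    path = [tag]
    for pointers in reversed(back[1:]):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]

# Toy romanized corpus (illustrative only); "rota" is not in the training data.
corpus = [
    [("ladka", "NN"), ("khata", "VB"), ("hai", "AUX")],
    [("ladki", "NN"), ("peeti", "VB"), ("hai", "AUX")],
    [("baccha", "NN"), ("sota", "VB"), ("hai", "AUX")],
]
trans, emit = train(corpus)
print(viterbi(["ladka", "rota", "hai"], trans, emit))
```

In this sketch the KNN step replaces only the emission score of an unseen word, so the transition model still constrains the tag sequence. A real system would likely use richer morphological features than raw suffix overlap.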

Details

Language :
English
ISSN :
18770509
Volume :
218
Database :
Supplemental Index
Journal :
Procedia Computer Science
Publication Type :
Academic Journal
Accession number :
161583819
Full Text :
https://doi.org/10.1016/j.procs.2023.01.042