Intelligent Part of Speech tagger for Hindi.
- Source :
- Procedia Computer Science; 2023, Vol. 218, p604-611, 8p
- Publication Year :
- 2023
Abstract
- English parts of speech such as noun, verb, adverb, adjective, pronoun, preposition, interjection, and conjunction are somewhat similar to those in Hindi but not exactly the same. Hindi grammar has different Part of Speech (POS) categories based on morphological features and the occurrence of a word/lexeme in a sentence. The existing techniques used for POS tagging in English may not work properly for Indian languages like Hindi, because the grammatical structure of a relatively free word order language like Hindi differs from that of English. Stochastic taggers may not perform well, as morphological information is not taken into account. The available Hindi word corpora usually have low frequencies for individual tags; as a result, a larger corpus with diverse sentence types can provide better results. But even after applying smoothing techniques, most of these taggers fail to provide correct results in the presence of unknown words. Considering these aspects, this paper proposes an Intelligent POS tagger for the Hindi language based on Viterbi and K-Nearest Neighbour, capable of providing more accurate results than Viterbi alone in the presence of unknown words. [ABSTRACT FROM AUTHOR]
- Subjects :
- PARTS of speech
WORD order (Grammar)
HINDI language
PREPOSITIONS
ENGLISH language
Details
- Language :
- English
- ISSN :
- 18770509
- Volume :
- 218
- Database :
- Supplemental Index
- Journal :
- Procedia Computer Science
- Publication Type :
- Academic Journal
- Accession number :
- 161583819
- Full Text :
- https://doi.org/10.1016/j.procs.2023.01.042