Annotating Sanskrit Corpus: Adapting IL-POSTS
- Source :
- Human Language Technology. Challenges for Computer Science and Linguistics ISBN: 9783642200946, LTC
- Publication Year :
- 2011
- Publisher :
- Springer Berlin Heidelberg, 2011.
Abstract
- This paper presents an experiment in using the hierarchical Indic Languages POS Tagset (IL-POSTS) (Baskaran et al. 2008a, 2008b), developed by Microsoft Research India (MSRI) for tagging Indian languages, to annotate a Sanskrit corpus. Sanskrit is a language with rich morphology and relatively free word order. The authors have added and removed certain tags according to the requirements of the Sanskrit data. A revision of the IL-POSTS annotation guidelines is also presented. The authors further describe an experiment in training the tagger at MSRI and document the results.
Details
- ISBN :
- 978-3-642-20094-6
- ISBNs :
- 9783642200946
- Database :
- OpenAIRE
- Accession number :
- edsair.doi...........188749d5727780d7406d62a55feb6b23
- Full Text :
- https://doi.org/10.1007/978-3-642-20095-3_34