Segmentation of Printed Devnagari Documents
- Source :
- Advances in Computing and Information Technology ISBN: 9783642225543
- Publication Year :
- 2011
- Publisher :
- Springer Berlin Heidelberg, 2011.
Abstract
- Document segmentation is one of the most important phases in machine recognition of any language. Correct segmentation of individual symbols determines the success of a character recognition technique. Segmentation decomposes an image of a sequence of characters into sub-images of individual symbols by first segmenting lines and words. Devnagari is the most popular script in India and is used for writing Hindi, Marathi, Sanskrit and Nepali. Moreover, Hindi is the third most popular language in the world. Devnagari documents consist of vowels, consonants and various modifiers, so proper segmentation of a Devnagari word is challenging. This paper proposes a simple bounding-box-based approach to segmenting Devnagari documents and discusses various challenges in segmenting Devnagari script.
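The abstract does not spell out the paper's algorithm, so the following is only an illustrative sketch of the general idea of splitting a page into lines and then words and reporting each word as a bounding box. It assumes a binarized image (1 = ink, 0 = background) and uses plain projection profiles, which may differ from the authors' method. The names `runs`, `segment_lines`, `segment_words` and the `word_gap` parameter are hypothetical and chosen for this example.

```python
from typing import List, Tuple

Image = List[List[int]]            # binary image: 1 = ink, 0 = background
Box = Tuple[int, int, int, int]    # (top, left, bottom, right), bottom/right exclusive


def runs(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return [start, end) spans where profile > 0.

    Spans separated by fewer than min_gap empty positions are merged,
    so small gaps inside a word do not split it.
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))

    merged: List[Tuple[int, int]] = []
    for span in spans:
        if merged and span[0] - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], span[1])
        else:
            merged.append(span)
    return merged


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Find text lines as row ranges using a horizontal projection profile."""
    row_profile = [sum(row) for row in image]
    return runs(row_profile)


def segment_words(image: Image, word_gap: int = 2) -> List[Box]:
    """Find word bounding boxes inside each line (assumed gap threshold: word_gap)."""
    boxes: List[Box] = []
    for top, bottom in segment_lines(image):
        width = len(image[0])
        # Vertical projection profile restricted to the rows of this line.
        col_profile = [sum(image[r][c] for r in range(top, bottom))
                       for c in range(width)]
        for left, right in runs(col_profile, min_gap=word_gap):
            boxes.append((top, left, bottom, right))
    return boxes


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 0, 1, 0, 0, 1, 1],
        [1, 0, 0, 1, 0, 0, 0, 1],
        [0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0, 0, 0],
    ]
    for box in segment_words(page):
        print(box)
```

Each returned box could then be cropped and passed to a further step that separates individual symbols and modifiers. That step is where the script-specific difficulties mentioned in the abstract arise, and it is not attempted here.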
- Subjects :
- Hindi
Nepali
business.industry
Computer science
Text segmentation
ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION
computer.software_genre
language.human_language
Market segmentation
language
Segmentation
Artificial intelligence
Marathi
business
Sanskrit
computer
Natural language processing
Word (computer architecture)
Details
- ISBN :
- 978-3-642-22554-3
- ISBNs :
- 9783642225543
- Database :
- OpenAIRE
- Journal :
- Advances in Computing and Information Technology ISBN: 9783642225543
- Accession number :
- edsair.doi...........5573ae03e2e25622d52ed8c08f9fe5c7
- Full Text :
- https://doi.org/10.1007/978-3-642-22555-0_23