Offline handwritten Gurumukhi word recognition using eXtreme Gradient Boosting methodology
- Source :
- Soft Computing. 25:4451-4464
- Publication Year :
- 2020
- Publisher :
- Springer Science and Business Media LLC, 2020.
Abstract
- Handwritten word recognition is a challenging task because writing styles vary widely between individuals. Considerable effort has therefore gone into recognizing handwritten words with efficient classifiers that operate on features extracted from the visual appearance of the handwritten text. Owing to its numerous real-time applications, handwritten word recognition has been an active research area that has attracted a lot of attention over the last 10 years. In this article, the authors propose a holistic approach that uses the eXtreme Gradient Boosting (XGBoost) technique to recognize offline handwritten Gurumukhi words. Four state-of-the-art feature types, namely zoning, diagonal, intersection and open-end points, and peak extent features, are used to extract discriminant features from digital images of handwritten words. The proposed approach is evaluated on a public benchmark dataset of Gurumukhi script that comprises 40,000 samples of handwritten words. Based on the extracted features, XGBoost classifies each word into one of 100 classes. The effectiveness of the system is assessed with several evaluation parameters: CPU elapsed time, accuracy, precision, recall, F1-score and area under the curve (AUC). XGBoost attained its best results, with accuracy of 91.66%, recall of 91.66%, precision of 91.39%, F1-score of 91.14% and AUC of 95.66%, using zoning features with 90% of the data as the training set and the remaining 10% as the testing set. A comparison of the proposed approach with existing approaches is also presented, and it indicates the comparative advantage of the XGBoost technique.
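To make the zoning idea in the abstract concrete, the following is a minimal sketch of one common form of zoning features: the image is divided into a grid and the share of foreground pixels in each zone becomes one feature. The function name `zoning_features`, the grid size and the toy image are illustrative assumptions; the abstract does not specify the authors' grid, preprocessing or exact zoning variant.

```python
import numpy as np

def zoning_features(image, zones=(4, 4)):
    """Return the fraction of foreground pixels in each zone of a binary image.

    Assumption: foreground (ink) pixels are 1 and background pixels are 0.
    The grid size is a hypothetical choice, not taken from the article.
    """
    img = np.asarray(image, dtype=float)
    rows, cols = zones
    feats = []
    # np.array_split tolerates image sizes that are not divisible by the grid.
    for band in np.array_split(img, rows, axis=0):
        for cell in np.array_split(band, cols, axis=1):
            feats.append(cell.mean())
    return np.array(feats)

# Toy 8x8 binary "word image" whose left half is foreground.
toy = np.zeros((8, 8), dtype=int)
toy[:, :4] = 1
features = zoning_features(toy, zones=(2, 2))
print(features)
```

Feature vectors of this kind, one per word image, could then be passed to a gradient-boosting classifier such as `xgboost.XGBClassifier` after a 90%/10% train/test split like the one the abstract describes.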
- Subjects :
- 0209 industrial biotechnology
Training set
Computer science
Intersection (set theory)
Computational intelligence
Pattern recognition
02 engineering and technology
Theoretical Computer Science
Set (abstract data type)
Digital image
ComputingMethodologies_PATTERNRECOGNITION
020901 industrial engineering & automation
Word recognition
0202 electrical engineering, electronic engineering, information engineering
Benchmark (computing)
020201 artificial intelligence & image processing
Geometry and Topology
Artificial intelligence
Software
Word (computer architecture)
Details
- ISSN :
- 1433-7479 and 1432-7643
- Volume :
- 25
- Database :
- OpenAIRE
- Journal :
- Soft Computing
- Accession number :
- edsair.doi...........958c2332028fcb20529b3e6430bdfa88
- Full Text :
- https://doi.org/10.1007/s00500-020-05455-w