1. A Hybrid Ensemble Approach for Greek Text Classification Based on Multilingual Models.
- Author
Liapis, Charalampos M.; Kyritsis, Konstantinos; Perikos, Isidoros; Spatiotis, Nikolaos; Paraskevas, Michael
- Subjects
MACHINE learning, K-nearest neighbor classification, GREEK language, TRANSFORMER models, LOGISTIC regression analysis
- Abstract
This study explores text classification in the Greek language. We present a novel ensemble classification scheme built on embeddings generated from Greek text with the multilingual E5 model. Our approach incorporates partial transfer learning: pre-trained models are used to extract embeddings, which enables the evaluation of classical classifiers on Greek data. Additionally, we enhance predictive capability while keeping costs low by employing a soft voting combination scheme that exploits the strengths of XGBoost, K-nearest neighbors, and logistic regression. This method significantly improves all classification metrics, demonstrating the superiority of ensemble techniques in handling the complexity of Greek textual data. Our study contributes to the field of natural language processing by proposing an effective ensemble framework for the categorization of Greek texts, leveraging the advantages of both traditional and modern machine learning techniques. This framework has the potential to be applied to other less-resourced languages, thereby broadening the impact of our research beyond Greek language processing. [ABSTRACT FROM AUTHOR]
- Published
2024
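- Illustrative sketch
The soft voting combination named in the abstract can be sketched as follows. This is a minimal illustration of the general technique (average the class probabilities from several classifiers, then pick the most probable class), not the authors' implementation. The function name `soft_vote`, the optional `weights` parameter, and all probability values are assumptions made up for the example; in practice the three probability tables would come from XGBoost, K-nearest neighbors, and logistic regression models fitted on E5 embeddings.

```python
from typing import List, Optional, Sequence


def soft_vote(
    probas: Sequence[Sequence[Sequence[float]]],
    weights: Optional[Sequence[float]] = None,
) -> List[int]:
    """Return one predicted class index per sample by soft voting.

    probas[m][i][c] is model m's probability that sample i belongs to class c.
    weights (hypothetical, optional) gives each model's share of the vote.
    """
    n_models = len(probas)
    if n_models == 0:
        raise ValueError("at least one model is required")
    if weights is None:
        weights = [1.0] * n_models  # plain, unweighted average
    if len(weights) != n_models:
        raise ValueError("one weight per model is required")

    total = float(sum(weights))
    n_samples = len(probas[0])
    n_classes = len(probas[0][0])

    labels = []
    for i in range(n_samples):
        # Weighted mean probability of each class across all models.
        avg = [
            sum(w * probas[m][i][c] for m, w in enumerate(weights)) / total
            for c in range(n_classes)
        ]
        # The class with the highest averaged probability wins.
        labels.append(max(range(n_classes), key=avg.__getitem__))
    return labels


# Made-up probabilities for 2 samples and 3 classes (illustration only).
xgb_p = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]  # stands in for XGBoost
knn_p = [[0.4, 0.4, 0.2], [0.1, 0.3, 0.6]]  # stands in for K-nearest neighbors
lr_p = [[0.5, 0.2, 0.3], [0.2, 0.3, 0.5]]   # stands in for logistic regression

predictions = soft_vote([xgb_p, knn_p, lr_p])
print(predictions)
```

In this toy input the first model alone would assign the second sample to class 1, while the averaged probabilities favor class 2, which is the kind of disagreement soft voting is meant to resolve.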