An improved sentiment classification model based on data quality and word embeddings.

Authors :
Siagh, Asma
Laallam, Fatima Zohra
Kazar, Okba
Salem, Hajer
Source :
Journal of Supercomputing. Jul 2023, Vol. 79, Issue 11, pp. 11871-11894. 24 pp.
Publication Year :
2023

Abstract

User-generated content on social media platforms has reached big data levels. Sentiment analysis of this data provides opportunities to gain valuable insights into any domain. However, real-world data is often class-imbalanced, which can impair a model's ability to generalize because the model overfits the majority class. An efficient model that handles any imbalanced-data scenario is therefore needed in practice. To this end, this work proposes several models built by studying how data quality and transfer learning through pre-trained embeddings affect minority class detection. The proposed models are tested on imbalanced datasets related to social media and education. The experimental results highlight the effectiveness of Word2vec, GloVe, and fastText embeddings with preprocessed data. In contrast, BERT embeddings yield better results with non-preprocessed data. Furthermore, the best-performing model from this study outperforms other methods, with notable improvements. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
09208542
Volume :
79
Issue :
11
Database :
Academic Search Index
Journal :
Journal of Supercomputing
Publication Type :
Academic Journal
Accession number :
164225554
Full Text :
https://doi.org/10.1007/s11227-023-05099-1