Virality classification from Twitter data using pre-trained language model and multi-layer perceptron.
- Source :
- Indonesian Journal of Electrical Engineering & Computer Science; Sep2024, Vol. 35 Issue 3, p1952-1962, 11p
- Publication Year :
- 2024
Abstract
- Twitter is a well-known text-based social media platform that is often used to disseminate content. According to Katadata, Indonesia ranked fifth in the world in 2023. Many people and organizations therefore want their tweets to go viral. This research aims to develop a model that uses Indonesian-language tweet data from Twitter to categorize the level of virality. Classifying the level of virality involves several tasks: upsampling the data, predicting sentiment and emotion, and text embedding. Upsampling was carried out because the dataset was imbalanced. Data upsampling, emotion prediction, and text embedding are carried out using the bidirectional encoder representations from transformers (BERT) model, while sentiment prediction uses the robustly optimized BERT pretraining approach (RoBERTa). The text embedding, sentiment, and emotion results are combined with Twitter metadata, and all features are fed into a multi-layer perceptron (MLP) model to classify the level of virality into three classes based on the number of retweets: low, medium, and high. The proposed method produces an F1-score of 49% and an accuracy of 95% and performs better than the baseline model. [ABSTRACT FROM AUTHOR]
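The data-preparation steps named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the retweet cut-offs `LOW_MAX` and `MEDIUM_MAX`, all function names, and the toy vectors are hypothetical, and plain random duplication stands in for the BERT-based upsampling the abstract describes.

```python
import random
from collections import Counter

# Hypothetical retweet cut-offs; the abstract does not state the real ones.
LOW_MAX = 10
MEDIUM_MAX = 100


def virality_class(retweets: int) -> str:
    """Map a retweet count to one of the three classes named in the abstract."""
    if retweets <= LOW_MAX:
        return "low"
    if retweets <= MEDIUM_MAX:
        return "medium"
    return "high"


def build_features(text_embedding, sentiment, emotion, metadata):
    """Concatenate the four feature groups into one flat input vector."""
    return list(text_embedding) + list(sentiment) + list(emotion) + list(metadata)


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows until every class matches the largest one.

    Assumption: simple random duplication, used here only for illustration.
    The paper upsamples with a BERT model instead.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy usage: eight tweets with an imbalanced class distribution.
retweet_counts = [0, 2, 5, 7, 9, 40, 80, 5000]
labels = [virality_class(n) for n in retweet_counts]
rows = [
    build_features([0.1, 0.2], [0.7, 0.2, 0.1], [0.5, 0.5], [n, 1])
    for n in retweet_counts
]
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

The balanced vectors would then be passed to an MLP classifier, for example scikit-learn's `MLPClassifier`. On imbalanced data a model can reach high accuracy by favoring the majority class while still scoring a low F1, which is one possible reading of the 95% accuracy and 49% F1-score reported above.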
- Subjects :
- LANGUAGE models
INDONESIAN language
SOCIAL media
METADATA
Details
- Language :
- English
- ISSN :
- 25024752
- Volume :
- 35
- Issue :
- 3
- Database :
- Complementary Index
- Journal :
- Indonesian Journal of Electrical Engineering & Computer Science
- Publication Type :
- Academic Journal
- Accession number :
- 179115035
- Full Text :
- https://doi.org/10.11591/ijeecs.v35.i3.pp1952-1962