
Detecting abusive Instagram comments in Turkish using convolutional Neural network and machine learning methods.

Authors :
Karayiğit, Habibe
İnan Acı, Çiğdem
Akdağlı, Ali
Source :
Expert Systems with Applications. Jul2021, Vol. 174, pN.PAG-N.PAG. 1p.
Publication Year :
2021

Abstract

• The first public dataset dedicated to detecting abusive Turkish messages.
• 10,528 abusive and 19,826 not-abusive Instagram comments were collected.
• CNN, NB, SVM, DT, RF, LR, AdaBoost, and XGBoost classifiers were evaluated.
• The best performance (F1-score: 0.974) was achieved by the CNN model.

Instagram is a free photo-sharing platform where each user has a profile and can upload photos for followers to view, like, and comment on. Abusive comments on images can be humiliating and harmful to those who share photos. Developing a comment filter for languages other than English is difficult and time-consuming. This paper proposes a dataset called Abusive Turkish Comments (ATC) for detecting abusive Instagram comments in Turkish. It consists of 30,354 Instagram comments posted to tabloid and sports accounts (10,528 abusive and 19,826 not-abusive). As far as we know, it is the first public dataset dedicated to detecting abusive Turkish messages. The sentiment annotation was done at the sentence level by assigning a polarity to each comment. Several abusive message detection models were evaluated: a Convolutional Neural Network (CNN), five well-known classifiers (Naive Bayes, Support Vector Machine, Decision Tree, Random Forest, and Logistic Regression), and two reweighted classifiers (Adaptive Boosting (AdaBoost) and eXtreme Gradient Boosting (XGBoost)) were compared in terms of F1-score, precision, and recall. The results showed that the best performance (micro-averaged F1-score: 0.974, macro-averaged F1-score: 0.973, Kappa value: 0.946) was obtained by the CNN model on the oversampled ATC dataset. The abusive message detection model proposed in this study can contribute to the development of Turkish comment filters on Instagram. Several model combinations were compared to select the one with the best recognition accuracy. [ABSTRACT FROM AUTHOR]
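The abstract reports results on an oversampled dataset and names three summary metrics (micro-averaged F1, macro-averaged F1, and Cohen's kappa). As a rough illustration of what those terms mean, here is a minimal sketch in plain Python. It is not the authors' code: the abstract does not say which oversampling method was used, so random oversampling with replacement is an assumption, and the function names `random_oversample` and `summary_scores` are hypothetical.

```python
import random
from collections import Counter


def random_oversample(labels, seed=0):
    """Return sample indices with every class resampled up to the majority size.

    Assumption: simple random oversampling with replacement; the abstract
    does not name the method actually used.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    target = max(len(ix) for ix in by_class.values())
    balanced = []
    for ix in by_class.values():
        balanced.extend(ix)  # keep every original sample
        balanced.extend(rng.choices(ix, k=target - len(ix)))  # pad minority
    return balanced


def summary_scores(y_true, y_pred):
    """Compute (micro-F1, macro-F1, Cohen's kappa) for single-label predictions."""
    classes = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))

    per_class_f1 = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        per_class_f1.append(2 * tp / denom if denom else 0.0)
    macro_f1 = sum(per_class_f1) / len(per_class_f1)

    # With exactly one label per sample, micro-averaged F1 equals accuracy.
    observed = sum(t == p for t, p in pairs) / n
    micro_f1 = observed

    # Cohen's kappa: agreement corrected for chance agreement.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in classes) / n**2
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 0.0
    return micro_f1, macro_f1, kappa


# Class sizes taken from the abstract; the labels themselves are placeholders.
labels = ["abusive"] * 10528 + ["not-abusive"] * 19826
balanced_idx = random_oversample(labels)
print(Counter(labels[i] for i in balanced_idx))
```

In practice, oversampling would be applied to the training split only, so that duplicated comments do not leak into the evaluation data; whether the study did this is not stated in the abstract.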

Details

Language :
English
ISSN :
09574174
Volume :
174
Database :
Academic Search Index
Journal :
Expert Systems with Applications
Publication Type :
Academic Journal
Accession number :
150231490
Full Text :
https://doi.org/10.1016/j.eswa.2021.114802