Deep Learning-based Hate Speech Detection in Code-mixed Tamil Text.

Authors :: Anbukkarasi, S.
Varadhaganapathy, S.
Source :: IETE Journal of Research. Nov2023, Vol. 69 Issue 11, p7893-7898. 6p.
Publication Year :: 2023
Abstract: Social media is a major channel of communication. People use platforms such as Twitter, Facebook, and Instagram to share their ideas, opinions, and feelings. Users of different age groups, cultures, and educational backgrounds use these powerful media. Although these platforms support knowledge sharing among users, they have a dark side too. Despite the restrictions the sites impose, many users use abusive language to damage the status and image of others. There is therefore a pressing need for governments or the social media platforms themselves to filter out such hate texts before they spread. Hate text detection is an emerging research topic in Natural Language Processing, in which a model predicts whether a given text is hate text or not. Automated hate text detection is difficult for Indian languages because of a lack of data. Moreover, many Indian users are multilingual and use code-mixed patterns to express their thoughts. The unavailability of an annotated Tamil-English dataset and the lack of a standard model make this task more challenging. To handle such code-mixed data, we create a dataset of 10,000 Tamil-English code-mixed texts collected from Twitter, each annotated as hate text or non-hate text. We use a synonym-based Bi-LSTM model to classify tweets as hate or non-hate text. [ABSTRACT FROM AUTHOR]