1. IndoBERT for classifying hate speech in Twitter.
- Author
Santosa, Hendri; Rachman, Fatahillah; Austen, Stanley Armando; Christianto; Girsang, Abba Suganda
- Subjects
HATE speech, ETHNICITY, LANGUAGE models, DISCRIMINATORY language, INCITEMENT to violence, TRANSFORMER models
- Abstract
Hate speech is any form of communication that expresses hatred, prejudice, or hostility toward an individual or group on the basis of attributes such as race, religion, ethnicity, nationality, gender, sexual orientation, disability, or other protected characteristics. It can be verbal, written, or symbolic, and it often involves derogatory language, offensive stereotypes, or incitement to violence or discrimination against the targeted individuals or groups. Hate speech is easily found in forums and discussions on social media, including Twitter. Twitter is a microblogging-based social media platform where users can read and write short texts called tweets. This research implements hate speech classification on Twitter using IndoBERT. IndoBERT is the Indonesian version of the BERT model, trained on over 220M words. It is based on the Transformer architecture rather than a Convolutional Neural Network (CNN): feature extraction in a Transformer is not performed by convolution with a kernel, as in a CNN, but by an encoder and decoder. The results demonstrate IndoBERT's excellent ability to classify hate speech. [ABSTRACT FROM AUTHOR]
- Published
- 2024