1. A survey of hate speech detection in Indian languages.
- Author
- Nandi, Arpan; Sarkar, Kamal; Mallick, Arjun; and De, Arkadeep
- Abstract
With the rapid growth in access to high-speed internet, the number of social media users is increasing quickly. Due to a lack of proper regulations and ethics, social media platforms are often contaminated by posts and comments containing abusive language and offensive remarks toward individuals, groups, races, religions, and communities. A single remark often triggers a long chain of reactions that are equally abusive or worse. To prevent such occurrences, there is a need for automated systems that can detect abusive texts and hate speech and remove them immediately. However, most existing research is limited to globally popular languages such as English. Since India is a nation of many diverse languages and multiple religions, abusive posts and remarks in Indian languages (in monolingual or code-mixed form) are no longer infrequent on social media platforms. Although resources such as hate speech lexicons and annotated datasets are limited for Indian languages, most research on hate speech detection in these languages has used traditional machine learning and deep learning methods. Multilingualism and code-mixing, however, make hate speech detection in Indian languages more challenging. Given these facts, this paper reviews the latest impactful research on hate speech detection in Indian languages, analyzing and comparing these works in terms of the datasets used, the feature extraction and classification methods applied, and the results achieved. [ABSTRACT FROM AUTHOR]
- Published
- 2024