1. An Efficient Language-Independent Acoustic Emotion Classification System
- Author
- Rajwinder Singh, Varun Gupta, Naveen Aggarwal, and Harshita Puri
- Subjects
Multidisciplinary, Computer science, Speech recognition, media_common.quotation_subject, Emotion classification, 010102 general mathematics, Ambiguity, English language, 01 natural sciences, language.human_language, German, language, DECIPHER, Emotion recognition, 0101 mathematics, Feature set, Classifier (UML), media_common
- Abstract
Recognizing emotion in human speech is essential to understanding the complexities of human nature. For a machine to accurately decipher the intended message in speech, it must understand the emotion behind the spoken words. Emotions control the modulations in speech, and these modulations may even change the context. In this paper, we propose a system that efficiently detects emotions from speech. Emotion recognition from speech is very complex because emotions occupy highly overlapping regions, and it is sometimes very difficult to distinguish between two emotions based on voice alone. Such ambiguity in label assignment is responsible for the low classification accuracy of existing systems. In the proposed system, we have worked on finding both a suitable feature set and a suitable classifier. The proposed system achieved a 29.74% increase in classification accuracy over the baseline human accuracy on the primary dataset, ‘CREMA-D’. Further, we have validated it on other standard datasets: ‘EmoDB’, ‘RAVDESS’, and ‘SAVEE’. ‘EmoDB’ is a German-language dataset, while the other two are English-language datasets, which is in line with the language-independent nature of our system. Compared with the current state of the art on these datasets, the proposed system gives better accuracy in most cases and, in some cases, accuracy comparable to baseline models or existing published work.
- Published
- 2019