
An Innovative Method for Speech Signal Emotion Recognition Based on Spectral Features Using GMM and HMM Techniques.

Authors :
Al-Dujaili Al-Khazraji, Mohammed Jawad
Ebrahimi-Moghadam, Abbas
Source :
Wireless Personal Communications; Jan2024, Vol. 134 Issue 2, p735-753, 19p
Publication Year :
2024

Abstract

Speech is one of the ways humans communicate. One of its important functions is to convey the speaker's inner feelings to the listener. When a speaker utters speech, that speech also carries the speaker's feelings, which leads to the creation of thoughts and behaviors appropriate to oneself. Speech Emotion Recognition (SER) is a very important issue in the field of human–machine interaction. The growing use of computers and their impact on daily life have led to this cooperation between humans and machines being widely researched. In this article, SER in English and Persian is examined. Time-frequency features such as Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding (LPC), and Perceptual Linear Prediction (PLP) are extracted from the data as feature vectors; these are then combined with one another, and suitable features are selected from them. Principal Component Analysis (PCA) is used to reduce dimensionality and eliminate redundancy while retaining most of the intrinsic information content of the pattern. Each emotional state is then classified using the Gaussian Mixture Model (GMM) and Hidden Markov Model (HMM) techniques. On the English database, combining MFCC + PLP features, PCA, and HMM classification gives an average recognition rate with a precision of 88.85% and a runtime of 0.3 s; similarly, on the Persian database, PLP features, PCA, and HMM classification give a precision of 90.21% and a runtime of 0.4 s. Based on the combination of features and classifiers, the experimental results demonstrate that the proposed approach can attain a high level of stable detection performance for every emotional state. [ABSTRACT FROM AUTHOR]
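
The classification step described in the abstract (scoring a feature vector against one statistical model per emotion and picking the best-scoring emotion) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single diagonal-covariance Gaussian per emotion, which is the one-component special case of a GMM, and it runs on synthetic three-dimensional vectors standing in for MFCC/PLP features after PCA. The emotion labels, class centres, and function names are hypothetical.

```python
import math
import random


def fit_diag_gaussian(vectors):
    """Estimate per-dimension mean and variance from a list of feature vectors."""
    n = len(vectors)
    dims = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dims)]
    variances = [
        # A small floor keeps the variance strictly positive.
        max(sum((v[d] - means[d]) ** 2 for v in vectors) / n, 1e-6)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(vector, model):
    """Log-density of one vector under a diagonal-covariance Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        for x, mu, var in zip(vector, means, variances)
    )


def classify(vector, models):
    """Return the label whose model assigns the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(vector, models[label]))


def make_synthetic(center, count, rng):
    """Draw unit-variance Gaussian samples around a class centre."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(count)]


rng = random.Random(0)

# Hypothetical class centres; real inputs would be reduced spectral features.
centers = {"neutral": [0.0, 0.0, 0.0], "angry": [6.0, -6.0, 6.0]}

# Train one model per emotion on its own samples.
train = {label: make_synthetic(c, 200, rng) for label, c in centers.items()}
models = {label: fit_diag_gaussian(vecs) for label, vecs in train.items()}

# Evaluate on fresh samples drawn from the same synthetic classes.
test = [(label, v) for label, c in centers.items() for v in make_synthetic(c, 50, rng)]
accuracy = sum(classify(v, models) == label for label, v in test) / len(test)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

A full GMM would replace each single Gaussian with a weighted sum of several components fitted by expectation-maximization, and an HMM would additionally model the order of frames over time; the decision rule of choosing the highest-likelihood class stays the same.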

Details

Language :
English
ISSN :
0929-6212
Volume :
134
Issue :
2
Database :
Complementary Index
Journal :
Wireless Personal Communications
Publication Type :
Academic Journal
Accession number :
176610075
Full Text :
https://doi.org/10.1007/s11277-024-10918-6