
Novel feature representation using single frequency filtering and nonlinear energy operator for speech emotion recognition.

Authors :
Thirumuru, Ramakrishna
Gurugubelli, Krishna
Vuppala, Anil Kumar
Source :
Digital Signal Processing. Jan2022, Vol. 120, pN.PAG-N.PAG. 1p.
Publication Year :
2022

Abstract

In this paper, the intrinsic characteristics of speech modulations are estimated to propose the instant modulation spectral features for efficient emotion recognition. This feature representation is based on the single frequency filtering (SFF) technique and a higher-order nonlinear energy operator. The speech signal is decomposed into frequency sub-bands using SFF, and the associated nonlinear energies are estimated with the higher-order nonlinear energy operator. Then, the feature vector is realized using cepstral analysis. The high-resolution property of the SFF technique is exploited to extract the amplitude envelope of the speech signal at a selected frequency with good time-frequency resolution. The fourth-order nonlinear energy operator provides noise robustness in estimating the modulation components. The proposed feature set is tested for the emotion recognition task using the i-vector model with the probabilistic linear discriminant scoring scheme, support vector machine, and random forest classifiers. The results demonstrate that this feature representation performs better than the widely used spectral and prosody features, achieving detection accuracies of 85.75%, 59.88%, and 65.78% on three emotional databases, EMODB, FAU-AIBO, and IEMOCAP, respectively. Further, the proposed features are found to be robust in the presence of additive white Gaussian noise and vehicular noise. [ABSTRACT FROM AUTHOR]
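
The first two stages named in the abstract (SFF envelope extraction and a higher-order energy operator) can be illustrated with a minimal Python sketch. It is not taken from the paper. It assumes the commonly published form of SFF (shift the frequency of interest to half the sampling rate, then apply a single-pole filter with its pole at z = -r) and the k-th order discrete energy operator x[n]x[n+k-2] - x[n-1]x[n+k-1], which reduces to the Teager-Kaiser operator for k = 2. The function names, the pole radius r = 0.995, and the choice of input to the operator are assumptions; the cepstral analysis and classification stages are omitted.

```python
import cmath
import math


def sff_envelope(x, fs, f_k, r=0.995):
    """Amplitude envelope of x at frequency f_k (Hz) using single frequency filtering.

    Assumed form: multiply by exp(j*(pi - w_k)*n) so that f_k lands at fs/2,
    then filter with H(z) = 1 / (1 + r z^-1), whose pole sits at z = -r.
    """
    w_shift = math.pi - 2.0 * math.pi * f_k / fs
    envelope = []
    y_prev = 0j
    for n, sample in enumerate(x):
        shifted = sample * cmath.exp(1j * w_shift * n)
        y = shifted - r * y_prev  # y[n] = -r * y[n-1] + shifted[n]
        envelope.append(abs(y))
        y_prev = y
    return envelope


def energy_operator(x, k=4):
    """k-th order discrete energy operator (assumed definition).

    Returns x[n]*x[n+k-2] - x[n-1]*x[n+k-1] for every n where all four
    samples exist, so the output has len(x) - k entries.
    """
    return [
        x[n] * x[n + k - 2] - x[n - 1] * x[n + k - 1]
        for n in range(1, len(x) - k + 1)
    ]
```

For a stationary cosine of amplitude A and normalized frequency Ω, this operator returns the constant A² sin(Ω) sin((k-1)Ω), which gives a quick sanity check. In a feature pipeline along the lines the abstract describes, `sff_envelope` would be evaluated over a grid of frequencies `f_k` and the operator applied per sub-band; the exact signal the authors pass to the operator is not stated in this record.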

Details

Language :
English
ISSN :
10512004
Volume :
120
Database :
Academic Search Index
Journal :
Digital Signal Processing
Publication Type :
Periodical
Accession number :
153903214
Full Text :
https://doi.org/10.1016/j.dsp.2021.103293