Comparing spectrum estimators in speaker verification under additive noise degradation
- Source :
- Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2012, pp. 4769-4772. Kyoto, Japan: IEEE.
- Publication Year :
- 2012
- Publisher :
- IEEE, 2012.
Abstract
- This work was presented as a paper at the IEEE International Conference on Acoustics, Speech and Signal Processing, held in Kyoto, Japan, on 25-30 March 2012. Different short-term spectrum estimators for speaker verification under additive noise are considered. Conventionally, mel-frequency cepstral coefficients (MFCCs) are computed from discrete Fourier transform (DFT) spectra of windowed speech frames. Recently, linear prediction (LP) and its temporally weighted variants have been substituted as the spectrum analysis method in speech and speaker recognition. In this paper, 12 different short-term spectrum estimation methods are compared for speaker verification under additive noise contamination. Experiments conducted on the NIST 2002 SRE corpus show that the spectrum estimation method has a large effect on recognition performance: stabilized weighted LP (SWLP) and minimum variance distortionless response (MVDR) yield relative equal error rate (EER) improvements of approximately 7% and 8% over the standard DFT method at an SNR of -10 dB under factory and babble noise, respectively. (Inst Elect & Elect Engineers, Signal Processing Soc; IEEE.)
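As a rough illustration of the contrast the abstract draws between DFT-based and LP-based spectrum estimation, the sketch below computes both spectra for a single windowed frame. It is a minimal example under stated assumptions, not the authors' implementation: the function names, the 8 kHz sampling rate, the 30 ms frame, the LP order of 20, and the synthetic test signal are all illustrative choices, and the LP variant shown is plain autocorrelation-method LP rather than the weighted (SWLP) or MVDR estimators evaluated in the paper.

```python
import numpy as np

def dft_spectrum(frame, nfft=512):
    """Power spectrum of a Hamming-windowed frame via the DFT (periodogram)."""
    windowed = frame * np.hamming(len(frame))
    return np.abs(np.fft.rfft(windowed, nfft)) ** 2

def lp_spectrum(frame, order=20, nfft=512):
    """All-pole power spectrum from autocorrelation-method linear prediction."""
    windowed = frame * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(windowed[:n - k], windowed[k:]) for k in range(order + 1)])
    # Normal equations R a = r[1:], where R is the Toeplitz autocorrelation matrix.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    # Prediction error energy serves as the gain of the all-pole model.
    gain = r[0] - np.dot(a, r[1:])
    # Inverse filter A(z) = 1 - sum_k a_k z^{-k}, evaluated on the DFT grid.
    A = np.fft.rfft(np.concatenate(([1.0], -a)), nfft)
    return gain / np.abs(A) ** 2

# Synthetic frame (assumption): a 1 kHz tone in white noise, 30 ms at 8 kHz.
fs, nfft = 8000, 512
rng = np.random.default_rng(0)
t = np.arange(240) / fs
frame = np.sin(2 * np.pi * 1000 * t) + 0.1 * rng.standard_normal(t.size)

dft = dft_spectrum(frame, nfft)
lp = lp_spectrum(frame, order=20, nfft=nfft)
print("DFT peak (Hz):", np.argmax(dft) * fs / nfft)
print("LP peak (Hz): ", np.argmax(lp) * fs / nfft)
```

In an MFCC front end, either spectrum would then be passed through the mel filterbank, logarithm, and discrete cosine transform; only the spectrum estimation step differs between the methods being compared.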
- Subjects :
- Computer science
Speech recognition
Noise degradations
Word error rate
Linear prediction
02 engineering and technology
Engineering, electrical & electronic
Signal-to-noise ratio
Engineering
Babble noise
Spectrum estimation
Equal error rate
0202 electrical engineering, electronic engineering, information engineering
GeneralLiterature_REFERENCE(e.g.,dictionaries,encyclopedias,glossaries)
ta213
BBfor2 Cohesion
Estimator
Speaker recognition
Dft method
Fourier transforms
Speaker verification
Language Recognition
Utterance
Mel-frequency cepstrum
Speech frames
0305 other medical science
Signal processing
Mel-frequency cepstral coefficients
Noise contamination
Discrete Fourier transform
030507 speech-language pathology & audiology
03 medical and health sciences
Speech
noise robust
ta113
ta114
Minimum variance distortionless response
ta111
020206 networking & telecommunications
Acoustics
Weighted linear prediction
Spectrum analysis
Noise
Recognition
Recognition performance
Additive noise
Discrete
Language & Speech Technology
Spectrum estimators
Acoustic noise
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2012, pp. 4769-4772. Kyoto, Japan: IEEE.
- Accession number :
- edsair.doi.dedup.....ada7b527c0749f5c6d8b48e1a1fa32cc