A Speaker-Dependent Approach to Separation of Far-Field Multi-Talker Microphone Array Speech for Front-End Processing in the CHiME-5 Challenge
- Source :
- IEEE Journal of Selected Topics in Signal Processing. 13:827-840
- Publication Year :
- 2019
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2019.
Abstract
- We propose a novel speaker-dependent speech separation framework for the challenging CHiME-5 acoustic environments, exploiting the advantages of both deep learning based and conventional preprocessing techniques to prepare data effectively for separating target speech from multi-talker mixed speech collected with multiple microphone arrays. First, a series of multi-channel operations is conducted to reduce existing reverberation and noise, and a single-channel deep learning based speech enhancement model is used to predict speech presence probabilities. Next, a two-stage supervised speech separation approach, using oracle speaker diarization information from CHiME-5, is proposed to separate the speech of a target speaker from that of interference speakers in mixed speech. Given a set of three estimated masks (for the background noise, the target speaker, and the interference speakers) from the single-channel speech enhancement and separation models, a complex Gaussian mixture model based generalized eigenvalue beamformer is then used to enhance the signal at the reference array while avoiding the speaker permutation issue. Furthermore, the proposed front-end can generate a large variety of processed data for an ensemble of speech recognition results. Experiments on the development set show that the proposed two-stage approach yields significant improvements in recognition performance over the official baseline system, and it achieved the top accuracies in all four competing evaluation categories among all systems submitted to the CHiME-5 Challenge.
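The mask-driven beamforming step mentioned in the abstract can be illustrated with a minimal sketch of a generalized eigenvalue (GEV) beamformer. This is not the authors' implementation: the function names (`masked_covariance`, `gev_weights`, `apply_beamformer`), the array shapes, and the diagonal loading are assumptions made for illustration, and the complex Gaussian mixture model refinement of the masks is omitted. The sketch also assumes the noise and interference-speaker masks are merged into one "undesired" mask before use.

```python
import numpy as np

def masked_covariance(stft, mask):
    """Mask-weighted spatial covariance for each frequency bin.

    stft: complex array, shape (F, T, C) = (freq bins, frames, channels)
    mask: real array in [0, 1], shape (F, T)
    returns: complex array, shape (F, C, C)
    """
    weighted = stft * mask[..., None]
    # Sum over frames of m(f,t) * x(f,t) x(f,t)^H
    cov = np.einsum("ftc,ftd->fcd", weighted, stft.conj())
    norm = np.maximum(mask.sum(axis=1), 1e-10)
    return cov / norm[:, None, None]

def gev_weights(cov_target, cov_undesired, diag_load=1e-6):
    """Per-bin GEV weights: the principal generalized eigenvector of
    (cov_target, cov_undesired), which maximizes the output SNR."""
    n_freq, n_ch, _ = cov_target.shape
    w = np.empty((n_freq, n_ch), dtype=complex)
    for f in range(n_freq):
        # Diagonal loading (an assumption) keeps the matrix invertible.
        load = diag_load * np.trace(cov_undesired[f]).real / n_ch
        b = cov_undesired[f] + load * np.eye(n_ch)
        # Whiten with the Cholesky factor b = L L^H, then solve an
        # ordinary Hermitian eigenproblem for L^-1 A L^-H.
        inv_l = np.linalg.inv(np.linalg.cholesky(b))
        m = inv_l @ cov_target[f] @ inv_l.conj().T
        m = 0.5 * (m + m.conj().T)  # remove round-off asymmetry
        _, vecs = np.linalg.eigh(m)  # eigenvalues in ascending order
        w[f] = inv_l.conj().T @ vecs[:, -1]
    return w

def apply_beamformer(stft, w):
    """Beamformer output y(f,t) = w(f)^H x(f,t), shape (F, T)."""
    return np.einsum("fc,ftc->ft", w.conj(), stft)
```

Because the target mask always selects the same speaker, the target covariance is tied to that speaker in every frequency bin, which is one way to read the abstract's remark about avoiding the speaker permutation issue. In practice a GEV output usually needs a per-bin gain normalization afterwards, which this sketch leaves out.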
- Subjects :
- Reverberation
Microphone array
Microphone
Computer science
Deep learning
Speech recognition
020206 networking & telecommunications
02 engineering and technology
Background noise
Speaker diarisation
Speech enhancement
Noise
Signal Processing
0202 electrical engineering, electronic engineering, information engineering
Artificial intelligence
Electrical and Electronic Engineering
Details
- ISSN :
- 1941-0484 and 1932-4553
- Volume :
- 13
- Database :
- OpenAIRE
- Journal :
- IEEE Journal of Selected Topics in Signal Processing
- Accession number :
- edsair.doi...........b792a1653cc70dd324fce2048937c85a
- Full Text :
- https://doi.org/10.1109/jstsp.2019.2920764