A Speaker-Dependent Approach to Separation of Far-Field Multi-Talker Microphone Array Speech for Front-End Processing in the CHiME-5 Challenge

Authors :
Lei Sun
Chin-Hui Lee
Feng Ma
Yi Fang
Tian Gao
Jun Du
Source :
IEEE Journal of Selected Topics in Signal Processing. 13:827-840
Publication Year :
2019
Publisher :
Institute of Electrical and Electronics Engineers (IEEE), 2019.

Abstract

We propose a novel speaker-dependent speech separation framework for the challenging CHiME-5 acoustic environments. It combines deep-learning-based and conventional preprocessing techniques to prepare data for separating target speech from multi-talker mixed speech collected with multiple microphone arrays. First, a series of multi-channel operations reduces reverberation and noise, and a single-channel deep-learning-based speech enhancement model predicts speech presence probabilities. Next, a two-stage supervised speech separation approach, which uses the oracle speaker diarization information provided with CHiME-5, separates the speech of a target speaker from that of the interfering speakers in the mixture. The single-channel enhancement and separation models yield three estimated masks: one for the background noise, one for the target speaker, and one for the interfering speakers. Given these masks, a complex Gaussian mixture model based generalized eigenvalue beamformer enhances the signal at the reference array while avoiding the speaker permutation issue. Furthermore, the proposed front-end can generate a large variety of processed data, which supports an ensemble of speech recognition results. Experiments on the development set show that the proposed two-stage approach yields significant improvements in recognition performance over the official baseline system, and that it achieved top accuracies in all four competing evaluation categories among all systems submitted to the CHiME-5 Challenge.
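
The mask-driven beamforming step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it skips the complex Gaussian mixture model stage and feeds the masks directly into mask-weighted spatial covariance estimates, and it assumes the noise and interfering-speaker masks are simply summed to form the "unwanted" covariance. The function names (`masked_covariance`, `gev_weights`, `apply_beamformer`), the array layout (channels, frames, frequencies), and the diagonal-loading constant are all hypothetical choices made for this example.

```python
import numpy as np


def masked_covariance(stft, mask):
    """Mask-weighted spatial covariance matrix for every frequency bin.

    stft: complex array, shape (channels, frames, freqs)
    mask: real array in [0, 1], shape (frames, freqs)
    returns: complex array, shape (freqs, channels, channels)
    """
    weighted = stft * mask[None, :, :]
    # Sum over frames of m[t, f] * x[:, t, f] x[:, t, f]^H
    cov = np.einsum("ctf,dtf->fcd", weighted, stft.conj())
    norm = np.maximum(mask.sum(axis=0), 1e-10)
    return cov / norm[:, None, None]


def gev_weights(cov_target, cov_unwanted, diag_load=1e-6):
    """Generalized eigenvalue (max-SNR) beamformer weights per frequency.

    Maximizes w^H cov_target w / w^H cov_unwanted w by whitening with the
    Cholesky factor of cov_unwanted (an illustrative, NumPy-only route).
    returns: complex array, shape (freqs, channels)
    """
    n_freq, n_ch, _ = cov_target.shape
    weights = np.zeros((n_freq, n_ch), dtype=complex)
    for f in range(n_freq):
        # Diagonal loading keeps the unwanted covariance positive definite.
        load = diag_load * np.trace(cov_unwanted[f]).real / n_ch + 1e-12
        phi_n = cov_unwanted[f] + load * np.eye(n_ch)
        chol_inv = np.linalg.inv(np.linalg.cholesky(phi_n))
        whitened = chol_inv @ cov_target[f] @ chol_inv.conj().T
        whitened = 0.5 * (whitened + whitened.conj().T)  # enforce Hermitian
        _, vecs = np.linalg.eigh(whitened)  # eigenvalues in ascending order
        weights[f] = chol_inv.conj().T @ vecs[:, -1]
    return weights


def apply_beamformer(weights, stft):
    """Compute y[t, f] = w[f]^H x[:, t, f]; returns shape (frames, freqs)."""
    return np.einsum("fc,ctf->tf", weights.conj(), stft)
```

Under these assumptions, a call sequence would compute `masked_covariance(stft, mask_target)` and `masked_covariance(stft, mask_noise + mask_interferers)`, pass both to `gev_weights`, and then run `apply_beamformer`. Because the target mask is tied to a known speaker, the output channel is associated with that speaker by construction, which is the sense in which a speaker-dependent mask sidesteps the permutation ambiguity. The sketch omits any output normalization or post-filtering.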

Details

ISSN :
1941-0484 and 1932-4553
Volume :
13
Database :
OpenAIRE
Journal :
IEEE Journal of Selected Topics in Signal Processing
Accession number :
edsair.doi...........b792a1653cc70dd324fce2048937c85a
Full Text :
https://doi.org/10.1109/jstsp.2019.2920764