Deep unfolding for multichannel source separation

Authors :: Scott Wisdom
Shinji Watanabe
John R. Hershey
Jonathan Le Roux
Source :: ICASSP
Publication Year :: 2016
Publisher :: IEEE, 2016.
Abstract :: Deep unfolding has recently been proposed to derive novel deep network architectures from model-based approaches. In this paper, we consider its application to multichannel source separation. We unfold a multichannel Gaussian mixture model (MCGMM), resulting in a deep MCGMM computational network that directly processes complex-valued frequency-domain multichannel audio and has an architecture defined explicitly by a generative model, thus combining the advantages of deep networks and model-based approaches. We further extend the deep MCGMM by modeling the GMM states using an MRF, whose unfolded mean-field inference updates add dynamics across layers. Experiments on source separation for multichannel mixtures of two simultaneous speakers show that the deep MCGMM leads to improved performance with respect to the original MCGMM model.

Subjects :: Network architecture
Markov random field
Computer science
Inference
020206 networking & telecommunications
Pattern recognition
02 engineering and technology
Mixture model
030507 speech-language pathology & audiology
03 medical and health sciences
Generative model
0202 electrical engineering, electronic engineering, information engineering
Source separation
Artificial intelligence
0305 other medical science

Database :: OpenAIRE
Journal :: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Accession number :: edsair.doi...........92f411b7811bc7e79304e9a15cdb41c7
Full Text :: https://doi.org/10.1109/icassp.2016.7471649