A two-stage complex network using cycle-consistent generative adversarial networks for speech enhancement.

Authors :
Yu, Guochen
Wang, Yutian
Wang, Hui
Zhang, Qin
Zheng, Chengshi
Source :
Speech Communication. Nov 2021, Vol. 134, pp. 42-54. 13 pp.
Publication Year :
2021

Abstract

Cycle-consistent generative adversarial networks (CycleGAN) have shown promising performance for speech enhancement (SE), but one intractable shortcoming of CycleGAN-based SE systems is that the noise components propagate throughout the cycle and cannot be completely eliminated. Additionally, conventional CycleGAN-based SE systems estimate only the spectral magnitude and leave the phase unaltered. Motivated by the multi-stage learning concept, we propose a novel two-stage denoising system that combines a CycleGAN-based magnitude enhancing network with a subsequent complex spectral refining network. Specifically, in the first stage, a CycleGAN-based model estimates only the magnitude, which is then coupled with the original noisy phase to obtain a coarsely enhanced complex spectrum. The second stage then further suppresses the residual noise components and estimates the clean phase with a complex spectral mapping network, a purely complex-valued network composed of complex 2D convolution/deconvolution and complex temporal-frequency attention blocks. Experimental results on two public datasets demonstrate that the proposed approach consistently surpasses previous one-stage CycleGANs and other state-of-the-art SE systems in terms of various evaluation metrics, especially in background noise suppression.

• We propose a two-stage speech enhancement approach composed of a CycleGAN-based network and a complex denoising network.
• We decompose the original complex spectrum estimation into two sub-tasks, i.e., magnitude and phase.
• The proposed two-stage system outperforms the one-stage CycleGAN-based and many state-of-the-art speech enhancement approaches.

[ABSTRACT FROM AUTHOR]
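
The coupling step at the end of the first stage, where an estimated magnitude is combined with the unaltered noisy phase, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `couple_magnitude_with_noisy_phase`, the random test spectrum, and the halved magnitude standing in for the CycleGAN output are all assumptions made for illustration only.

```python
import numpy as np


def couple_magnitude_with_noisy_phase(enhanced_mag, noisy_spec):
    """Build a coarse complex spectrum from an estimated magnitude
    and the phase of the noisy complex spectrum (hypothetical helper)."""
    noisy_phase = np.angle(noisy_spec)              # phase is left unaltered
    return enhanced_mag * np.exp(1j * noisy_phase)  # |X| * e^{j * phase}


# Toy noisy spectrum of shape (frequency bins, frames); values are arbitrary.
rng = np.random.default_rng(0)
noisy_spec = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))

# Placeholder for the stage-one magnitude estimate (assumption, not a model).
enhanced_mag = 0.5 * np.abs(noisy_spec)

coarse_spec = couple_magnitude_with_noisy_phase(enhanced_mag, noisy_spec)
```

The resulting `coarse_spec` has the estimated magnitude but still carries the noisy phase, which is why the abstract describes a second, complex-valued stage that refines both the real and imaginary parts.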

Details

Language :
English
ISSN :
0167-6393
Volume :
134
Database :
Academic Search Index
Journal :
Speech Communication
Publication Type :
Academic Journal
Accession number :
152794073
Full Text :
https://doi.org/10.1016/j.specom.2021.09.001