A Multi-Resolution Approach to GAN-Based Speech Enhancement

Authors :
Hyung Yong Kim
Ji Won Yoon
Sung Jun Cheon
Woo Hyun Kang
Nam Soo Kim
Source :
Applied Sciences, Vol 11, Iss 2, p 721 (2021)
Publication Year :
2021
Publisher :
MDPI AG, 2021.

Abstract

Recently, generative adversarial networks (GANs) have been successfully applied to speech enhancement. However, there still remain two issues that need to be addressed: (1) GAN-based training is typically unstable due to its non-convex property, and (2) most of the conventional methods do not fully take advantage of the speech characteristics, which could result in a sub-optimal solution. In order to deal with these problems, we propose a progressive generator that can handle the speech in a multi-resolution fashion. Additionally, we propose a multi-scale discriminator that discriminates the real and generated speech at various sampling rates to stabilize GAN training. The proposed structure was compared with the conventional GAN-based speech enhancement algorithms using the VoiceBank-DEMAND dataset. Experimental results showed that the proposed approach can make the training faster and more stable, which improves the performance on various metrics for speech enhancement.
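As a rough illustration of the multi-resolution idea in the abstract, the sketch below builds several views of one waveform at successively halved sampling rates, the kind of input a multi-scale discriminator could compare for real and generated speech. The abstract does not state the sampling rates or the downsampling method, so the 16 kHz starting rate, the three scales, the pair-averaging filter, and the function names `downsample_by_two` and `multi_resolution_views` are all assumptions made for this example, not the authors' implementation.

```python
from typing import Dict, List


def downsample_by_two(signal: List[float]) -> List[float]:
    """Halve the sampling rate by averaging non-overlapping sample pairs.

    Averaging is a crude low-pass filter applied before decimation; it is an
    assumption for this sketch, and a real system would likely use a proper
    anti-aliasing filter.
    """
    usable = len(signal) - (len(signal) % 2)  # drop a trailing odd sample
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, usable, 2)]


def multi_resolution_views(
    signal: List[float], sample_rate: int, num_scales: int
) -> Dict[int, List[float]]:
    """Return the waveform at sample_rate, sample_rate / 2, and so on.

    The result maps each sampling rate (Hz) to the waveform at that rate and
    has num_scales entries. Each view would feed one sub-discriminator.
    """
    views = {sample_rate: list(signal)}
    current, rate = list(signal), sample_rate
    for _ in range(num_scales - 1):
        current = downsample_by_two(current)
        rate //= 2
        views[rate] = current
    return views


if __name__ == "__main__":
    # Toy 8-sample "waveform"; the 16 kHz rate is hypothetical.
    toy = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
    for rate, view in multi_resolution_views(toy, 16000, 3).items():
        print(rate, view)
```

In a full GAN setup, each view would be scored by its own discriminator and the losses combined; that part is omitted here because the record gives no details on it.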

Details

Language :
English
ISSN :
2076-3417
Volume :
11
Issue :
2
Database :
Directory of Open Access Journals
Journal :
Applied Sciences
Publication Type :
Academic Journal
Accession number :
edsdoj.6198c808ecdc45efb3b636ea4dbd8a2d
Document Type :
article
Full Text :
https://doi.org/10.3390/app11020721