Time-domain speech enhancement using generative adversarial networks
- Source :
- UPCommons. Portal del coneixement obert de la UPC, Universitat Politècnica de Catalunya (UPC)
- Publication Year :
- 2019
Abstract
- Speech enhancement processes recorded voice utterances to remove noise that impedes their intelligibility or degrades their quality. Typical speech enhancement systems rely on regression approaches that subtract noise or predict clean signals, and most do not operate directly on waveforms. In this work, we propose a generative approach that regenerates corrupted signals into a clean version by applying generative adversarial networks to the raw signal. We also explore several variations of the proposed system, gaining insight into suitable architectural choices for an adversarially trained convolutional autoencoder applied to speech. We conduct both objective and subjective evaluations to assess the proposed method. The objective evaluation helps us choose among variations and tune hyperparameters, while the subjective evaluation is a listening experiment with 42 subjects that confirms the effectiveness of the approach in practice. We also demonstrate that the approach applies to more general speech enhancement, where voices must be regenerated from whispered signals.
- Subjects :
- Linguistics and Language
Computer science
Speech recognition
Speech enhancement
Automatic speech recognition
02 engineering and technology
Intelligibility (communication)
01 natural sciences
Language and Linguistics
Neural networks (Computer science)
0103 physical sciences
0202 electrical engineering, electronic engineering, information engineering
Active listening
Time domain
Speech processing systems
010301 acoustics
Hyperparameter
Artificial neural network
Communication
020206 networking & telecommunications
Telecommunications engineering [UPC subject areas]
Autoencoder
Computer Science Applications
Audio transformation
Modeling and Simulation
Speech processing
Computer Vision and Pattern Recognition
Generative adversarial network
Software
Generative grammar
Neural networks
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- UPCommons. Portal del coneixement obert de la UPC, Universitat Politècnica de Catalunya (UPC)
- Accession number :
- edsair.doi.dedup.....047a33f55e2d2e1188f95ffaa77ba5f4