FA-GAN: Artifacts-free and Phase-aware High-fidelity GAN-based Vocoder

Authors :
Shen, Rubing
Ren, Yanzhen
Sun, Zongkun
Publication Year :
2024

Abstract

Generative adversarial network (GAN) based vocoders have attracted significant attention in speech synthesis for their high quality and fast inference speed. However, their output still contains many noticeable spectral artifacts, which degrade the quality of the synthesized speech. In this work, we propose FA-GAN, a novel GAN-based vocoder designed for few artifacts and high fidelity. To suppress the aliasing artifacts that non-ideal upsampling layers introduce in high-frequency components, we introduce an anti-aliased twin deconvolution module in the generator. To alleviate blurring artifacts and enrich the reconstruction of spectral details, we propose a novel fine-grained multi-resolution real and imaginary loss that assists in modeling phase information. Experimental results show that FA-GAN outperforms the compared approaches in improving audio quality and alleviating spectral artifacts, and exhibits superior performance in unseen-speaker scenarios.
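
The abstract does not give the formula for the multi-resolution real and imaginary loss, so the following is only a rough sketch of one plausible reading: an L1 distance between the real parts and between the imaginary parts of the STFT, averaged over several analysis resolutions. The function names, the Hann window, the L1 distance, and the three (FFT size, hop) pairs are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def stft(signal, n_fft, hop):
    """Complex STFT with a Hann window; trailing samples that do not fill a frame are dropped."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=-1)


def multi_res_real_imag_loss(
    reference,
    generated,
    resolutions=((512, 128), (1024, 256), (2048, 512)),  # assumed (n_fft, hop) pairs
):
    """Hypothetical loss: mean L1 distance on real and imaginary STFT parts per resolution."""
    total = 0.0
    for n_fft, hop in resolutions:
        ref = stft(reference, n_fft, hop)
        gen = stft(generated, n_fft, hop)
        # Comparing real and imaginary parts separately makes the loss sensitive
        # to phase, unlike a loss computed on magnitudes alone.
        total += np.mean(np.abs(ref.real - gen.real))
        total += np.mean(np.abs(ref.imag - gen.imag))
    return total / len(resolutions)
```

Under these assumptions, two signals with the same magnitude spectrum but different phase (for example, a sine and the same sine shifted by a quarter period) yield a non-zero loss, which is the property a phase-aware objective needs.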

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2407.04575
Document Type :
Working Paper