Improving GAN with inverse cumulative distribution function for tabular data synthesis

Authors :
Limin Pan
Qin Xiaonan
Senlin Luo
Li Ban
Source :
Neurocomputing. 456:373-383
Publication Year :
2021
Publisher :
Elsevier BV, 2021.

Abstract

Designing a generative model that synthesizes realistic tabular data is of great significance in data science. Existing tabular generative models have difficulty handling complicated and diverse marginal distribution types because of the vanishing gradient problem, and they pay little attention to the correlation between attributes. We propose a method that improves the generative adversarial network (GAN) with the inverse cumulative distribution function for tabular data synthesis. The method first transforms continuous columns into uniformly distributed data using the cumulative distribution function, which alleviates the vanishing gradient problem during training. It then trains a GAN on the transformed data, using a discriminator with a label reconstruction function: an auxiliary supervised task that helps the discriminator extract the correlations among attributes and model them accurately. Finally, we train a neural network for each continuous column to perform the inverse transformation of the generated data into the target distribution, which yields the synthetic data. Experiments on simulated and real-world datasets show that our method compares favorably against state-of-the-art methods in modeling tabular data.
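
The forward and inverse transformations described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an empirical CDF for the forward step and, in place of the per-column neural network the abstract describes, plain linear interpolation of the empirical quantiles for the inverse step. The function names (`fit_empirical_cdf`, `to_uniform`, `from_uniform`) and the log-normal test column are hypothetical, and uniform random numbers stand in for the output of a trained GAN generator.

```python
import numpy as np


def fit_empirical_cdf(column):
    """Return the sorted values and their empirical CDF levels in (0, 1)."""
    sorted_vals = np.sort(np.asarray(column, dtype=float))
    n = sorted_vals.size
    # Mid-point plotting positions keep the levels strictly inside (0, 1).
    levels = (np.arange(1, n + 1) - 0.5) / n
    return sorted_vals, levels


def to_uniform(x, sorted_vals, levels):
    """Forward step: map values to approximately Uniform(0, 1) via the CDF."""
    return np.interp(x, sorted_vals, levels)


def from_uniform(u, sorted_vals, levels):
    """Inverse step: map uniform values back to the original scale.

    Assumption: interpolation replaces the neural network that the
    abstract says is trained for each continuous column.
    """
    return np.interp(u, levels, sorted_vals)


rng = np.random.default_rng(0)

# Hypothetical skewed continuous column.
column = rng.lognormal(mean=0.0, sigma=1.0, size=5000)

sorted_vals, levels = fit_empirical_cdf(column)
u = to_uniform(column, sorted_vals, levels)          # data a GAN would train on
recovered = from_uniform(u, sorted_vals, levels)     # round trip

# Stand-in for generator output in the uniform space (assumption).
fake_u = rng.uniform(size=5000)
synthetic = from_uniform(fake_u, sorted_vals, levels)
```

The round trip recovers the training column, and the synthetic values stay within the observed range because interpolation clamps at the end points. A learned inverse, as in the abstract, could extrapolate differently.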

Details

ISSN :
0925-2312
Volume :
456
Database :
OpenAIRE
Journal :
Neurocomputing
Accession number :
edsair.doi...........775a24c1c2d8bfab2af8628b14581597