Back to Search Start Over

Synthesis and quality assessment of combined time-series and static medical data using a real-world time-series generative adversarial network

Authors :
Jaewon Kim
Hyunwoo Choo
Soo-Yong Shin
Kyoung Doo Song
Source :
Scientific Reports, Vol 14, Iss 1, Pp 1-9 (2024)
Publication Year :
2024
Publisher :
Nature Portfolio, 2024.

Abstract

Abstract This study addresses challenges related to privacy issues in utilizing medical data, particularly the protection of personal information. To overcome this obstacle, the research focuses on data synthesis using real-world time-series generative adversarial networks (RTSGAN). A total of 53,005 data were synthesized using the dataset of 15,799 patients with colorectal cancer. The results of the quantitative evaluation of the synthetic data’s quality are as follows: the Hellinger distance ranged from 0 to 0.25; the train on synthetic, test on real (TSTR) and train on real, test on synthetic (TRTS) results showed an average area under the curve of 0.99 and 0.98; a propensity mean squared error was 0.223. The synthetic and real data were similar in the qualitative methods including t-SNE and histogram analyses. The application of synthetic data in predicting five-year survival in colorectal cancer patients demonstrates comparable performance to models based on real data. This study employs distance to closest records and membership inference test to assess potential privacy exposure, revealing minimal risk. This study demonstrated that it is feasible to synthesize medical data, including time-series data, using the RTSGAN, and the synthetic data can be evaluated to accurately reflect the characteristics of real data through quantitative and qualitative methods as well as by utilizing real-world artificial intelligence models.

Details

Language :
English
ISSN :
20452322
Volume :
14
Issue :
1
Database :
Directory of Open Access Journals
Journal :
Scientific Reports
Publication Type :
Academic Journal
Accession number :
edsdoj.8f43a7f4e2b041709f200ec4820c628c
Document Type :
article
Full Text :
https://doi.org/10.1038/s41598-024-69812-7