
KeyGAN: Synthetic keystroke data generation in the context of digital phenotyping.

Authors :
Acien A
Morales A
Giancardo L
Vera-Rodriguez R
Holmes AA
Fierrez J
Arroyo-Gallego T
Source :
Computers in biology and medicine [Comput Biol Med] 2025 Jan; Vol. 184, pp. 109460. Date of Electronic Publication: 2024 Nov 29.
Publication Year :
2025

Abstract

Objective: This paper aims to introduce and assess KeyGAN, a generative modeling-based keystroke data synthesizer. The synthesizer is designed to generate realistic synthetic keystroke data capturing the nuances of fine motor control and cognitive processes that govern finger-keyboard kinematics, thereby paving the way to support biomarker development for psychomotor impairment due to neurodegeneration.

Methods: KeyGAN is designed with two primary objectives: (i) to ensure high realism in the synthetic distributions of the keystroke features and (ii) to analyze its ability to replicate the subtleties of natural typing for enhancing biomarker development. The quality of synthetic keystroke data produced by KeyGAN is evaluated against two keystroke-based applications, TypeNet and nQiMechPD, employed as 'referee' controls. The performance of KeyGAN is compared with a reference random Gaussian generator, testing its ability to fool the biometric authentication method TypeNet, and its ability to characterize fine motor impairment in Parkinson's Disease using nQiMechPD.

Results: KeyGAN outperformed the reference comparator in fooling the biometric authentication method TypeNet. It also approximated real data more closely than the reference comparator when using nQiMechPD, showcasing its adaptability and versatility in mimicking early signs of Parkinson's Disease in natural typing. KeyGAN's synthetic data demonstrated that almost 20% of real PD samples could be replaced in the training set without a decline in classification performance on the real test set. Low Fréchet Distance (<0.03) and Kullback-Leibler Divergence (<700) between KeyGAN outputs and real data distributions underline the high performance of KeyGAN.

Conclusion: KeyGAN presents strong potential as a realistic keystroke data synthesizer, displaying impressive capability to reproduce complex typing patterns relevant to biomarkers for neurological disorders, such as Parkinson's Disease. The ability of its synthetic data to effectively supplement real data for training algorithms without affecting performance implies significant promise for advancing research in digital biomarkers for neurodegenerative and psychomotor disorders.

Competing Interests: Declaration of competing interest: A.A. and T.A.-G. are employees at Area2 Inc. and received a regular salary while contributing to the work. L.G. is an inventor on a patent currently licensed to Area2 Inc. in the same general research area of this work.

(Copyright © 2024 The Authors. Published by Elsevier Ltd. All rights reserved.)
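The abstract reports a Fréchet Distance and a Kullback-Leibler Divergence between synthetic and real keystroke distributions, with a random Gaussian generator as the reference comparator, but it does not say how these were computed. The Python sketch below is only an illustration of what such a comparison could look like for a single one-dimensional feature. Everything in it is an assumption rather than the authors' method: the function names, the simulated "hold time" values, the histogram binning, and the use of the closed-form squared Fréchet distance between Gaussian fits.

```python
import math
import random
from statistics import fmean, pstdev


def frechet_distance_sq_1d(real, synth):
    """Squared Frechet distance between Gaussian fits of two 1-D samples.

    For univariate Gaussians this reduces to
    (mu_r - mu_s)^2 + (sd_r - sd_s)^2.
    """
    mu_r, mu_s = fmean(real), fmean(synth)
    sd_r, sd_s = pstdev(real), pstdev(synth)
    return (mu_r - mu_s) ** 2 + (sd_r - sd_s) ** 2


def kl_divergence_hist(real, synth, bins=20, eps=1e-9):
    """KL(real || synth) estimated from histograms on a shared grid."""
    lo = min(min(real), min(synth))
    hi = max(max(real), max(synth))
    width = (hi - lo) / bins or 1.0  # guard against a zero-width range

    def normalized_hist(xs):
        counts = [0] * bins
        for x in xs:
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
        # Additive smoothing keeps empty bins from producing log(0).
        smoothed = [c / len(xs) + eps for c in counts]
        total = sum(smoothed)
        return [s / total for s in smoothed]

    p = normalized_hist(real)
    q = normalized_hist(synth)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical data: simulated key hold times in seconds (not real keystrokes).
rng = random.Random(0)
real_hold = [rng.gauss(0.10, 0.02) for _ in range(2000)]

# Reference-style baseline: sample from a Gaussian fitted to the real feature.
mu, sd = fmean(real_hold), pstdev(real_hold)
gaussian_baseline = [rng.gauss(mu, sd) for _ in range(2000)]

# A deliberately mismatched generator, for contrast.
mismatched = [rng.gauss(0.15, 0.04) for _ in range(2000)]

fd_baseline = frechet_distance_sq_1d(real_hold, gaussian_baseline)
fd_mismatched = frechet_distance_sq_1d(real_hold, mismatched)
kl_baseline = kl_divergence_hist(real_hold, gaussian_baseline)
kl_mismatched = kl_divergence_hist(real_hold, mismatched)

print("Frechet (baseline, mismatched):", fd_baseline, fd_mismatched)
print("KL (baseline, mismatched):", kl_baseline, kl_mismatched)
```

Both metrics are zero when the two samples are identical and grow as the synthetic distribution drifts from the real one. The values this sketch produces are not comparable to the thresholds quoted in the abstract, since the features, dimensionality, and estimators used in the paper are not given here.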

Details

Language :
English
ISSN :
1879-0534
Volume :
184
Database :
MEDLINE
Journal :
Computers in biology and medicine
Publication Type :
Academic Journal
Accession number :
39615234
Full Text :
https://doi.org/10.1016/j.compbiomed.2024.109460