%0 Journal Article
%@ 1438-8871
%I JMIR Publications
%V 27
%P e63130
%T Generating Artificial Patients With Reliable Clinical Characteristics Using a Geometry-Based Variational Autoencoder: Proof-of-Concept Feasibility Study
%A Ferré,Fabrice
%A Allassonnière,Stéphanie
%A Chadebec,Clément
%A Minville,Vincent
%+ Department of Anesthesia, Intensive Care and Perioperative Medicine, Purpan University Hospital, Place du Dr Baylac, Toulouse, 31300, France, 33 561779988, fabriceferre31@gmail.com
%K digital health
%K artificial data
%K variational autoencoder
%K data science
%K artificial intelligence
%K health monitoring
%K deep learning
%K medical imaging
%K imaging
%K magnetic resonance imaging
%K Alzheimer disease
%K anesthesia
%K prediction
%K data augmentation
%D 2025
%7 17.4.2025
%9 Short Paper
%J J Med Internet Res
%G English
%X Background: Artificial patient technology could transform health care by accelerating diagnosis, treatment, and the mapping of clinical pathways. Deep learning methods for generating artificial data in health care include data augmentation with variational autoencoder (VAE) technology. Objective: We aimed to test the feasibility of generating artificial patients with reliable clinical characteristics by using a geometry-based VAE applied, for the first time, to high-dimensional, low-sample-size tabular data. Methods: Clinical tabular data were extracted from 521 real patients of the “MAX” digital conversational agent (BOTdesign), created to prepare patients for anesthesia. A 3-stage methodological approach was implemented to generate up to 10,000 artificial patients: training the model and generating artificial data, assessing the consistency and confidentiality of the artificial data, and validating the plausibility of the newly created artificial patients. Results: We demonstrated the feasibility of applying the VAE technique to tabular data to generate large artificial patient cohorts with high consistency (fidelity scores >94%). Moreover, artificial patients could not be matched with real patients (filter similarity scores >99%, κ coefficients of agreement <0.2), thus safeguarding confidentiality, an essential ethical concern. Conclusions: This proof-of-concept study demonstrated our ability to augment real tabular data to generate artificial patients. These promising results make it possible to envisage in silico trials carried out on large cohorts of artificial patients, thereby overcoming the pitfalls usually encountered in in vivo trials. Further studies integrating longitudinal dynamics are needed to map patient trajectories.
%R 10.2196/63130
%U https://www.jmir.org/2025/1/e63130
%U https://doi.org/10.2196/63130