Development and multi-site external validation of a generalizable risk prediction model for bipolar disorder

Authors :
Colin G. Walsh
Michael A. Ripperger
Yirui Hu
Yi-han Sheu
Hyunjoon Lee
Drew Wilimitis
Amanda B. Zheutlin
Daniel Rocha
Karmel W. Choi
Victor M. Castro
H. Lester Kirchner
Christopher F. Chabris
Lea K. Davis
Jordan W. Smoller
Source :
Translational Psychiatry, Vol 14, Iss 1, Pp 1-7 (2024)
Publication Year :
2024
Publisher :
Nature Publishing Group, 2024.

Abstract

Bipolar disorder is a leading contributor to disability, premature mortality, and suicide. Early identification of risk for bipolar disorder using generalizable predictive models trained on diverse cohorts around the United States could improve targeted assessment of high-risk individuals, reduce misdiagnosis, and improve the allocation of limited mental health resources. This observational case-control study aimed to develop and validate generalizable predictive models of bipolar disorder as part of the multisite, multinational PsycheMERGE Network across diverse and large biobanks with linked electronic health records (EHRs) from three academic medical centers: in the Northeast (Massachusetts General Brigham), the Mid-Atlantic (Geisinger), and the Mid-South (Vanderbilt University Medical Center). Predictive models were developed and validated with multiple algorithms at each study site: random forests, gradient boosting machines, penalized regression, and stacked ensemble learning algorithms combining them. Predictors were limited to widely available EHR-based features agnostic to a common data model, including demographics, diagnostic codes, and medications. The main study outcome was bipolar disorder diagnosis as defined by the International Cohort Collection for Bipolar Disorder, 2015. In total, the study included records for 3,529,569 patients, including 12,533 cases (0.3%) of bipolar disorder. After internal and external validation, algorithms demonstrated optimal performance in their respective development sites. The stacked ensemble achieved the best combination of overall discrimination (AUC = 0.82–0.87) and calibration performance, with positive predictive values above 5% in the highest risk quantiles at all three study sites. In conclusion, generalizable predictive models of risk for bipolar disorder can be feasibly developed across diverse sites to enable precision medicine. Comparison of a range of machine learning methods indicated that an ensemble approach provided the best performance overall but required local retraining. These models will be disseminated via the PsycheMERGE Network website.
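
The positive predictive value reported for the highest risk quantiles can be read against the very low prevalence of the outcome. The following is a minimal illustrative sketch, not the study's code: the function name `ppv_in_top_quantile`, the choice of 20 quantiles, and the toy cohort are all assumptions made for illustration, since the abstract does not state how the quantiles were defined.

```python
from typing import Sequence


def ppv_in_top_quantile(
    scores: Sequence[float],
    labels: Sequence[int],
    n_quantiles: int = 20,  # assumption: the abstract does not give the quantile width
) -> float:
    """Fraction of true cases among patients in the highest-risk quantile."""
    if len(scores) != len(labels) or len(scores) == 0:
        raise ValueError("scores and labels must be non-empty and of equal length")
    # Rank patient indices from highest to lowest predicted risk.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_n = max(1, len(scores) // n_quantiles)
    top = order[:top_n]
    return sum(labels[i] for i in top) / top_n


# Synthetic toy cohort (hypothetical data): 1,000 patients, 5 cases.
scores = [i / 1000 for i in range(1000)]
labels = [0] * 1000
for case_index in (999, 990, 975, 960, 400):
    labels[case_index] = 1

prevalence = sum(labels) / len(labels)
top_ppv = ppv_in_top_quantile(scores, labels, n_quantiles=20)
print(f"prevalence={prevalence:.3f}, PPV in top quantile={top_ppv:.3f}")
```

In this toy cohort, four of the five cases fall among the 50 highest-scoring patients, so the top-quantile PPV is many times the overall prevalence. That is the sense in which a PPV above 5% can be informative when only about 0.3% of patients are cases.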

Details

Language :
English
ISSN :
21583188
Volume :
14
Issue :
1
Database :
Directory of Open Access Journals
Journal :
Translational Psychiatry
Publication Type :
Academic Journal
Accession number :
edsdoj.94f1cfffc2144cf09173606b3f636a0f
Document Type :
article
Full Text :
https://doi.org/10.1038/s41398-023-02720-y