
Determining Distinct Suicide Attempts From Recurrent Electronic Health Record Codes: Classification Study

Authors :
Kate H Bentley
Emily M Madsen
Eugene Song
Yu Zhou
Victor Castro
Hyunjoon Lee
Younga H Lee
Jordan W Smoller
Source :
JMIR Formative Research, Vol 8, p e46364 (2024)
Publication Year :
2024
Publisher :
JMIR Publications, 2024.

Abstract

Background: Prior suicide attempts are a relatively strong risk factor for future suicide attempts. There is growing interest in using longitudinal electronic health record (EHR) data to derive statistical risk prediction models for future suicide attempts and other suicidal behavior outcomes. However, model performance may be inflated by a largely unrecognized form of “data leakage” during model training: diagnostic codes for suicide attempt outcomes may refer to prior attempts that are also included in the model as predictors.
Objective: We aimed to develop an automated rule for determining when documented suicide attempt diagnostic codes identify distinct suicide attempt events.
Methods: From a large health care system’s EHR, we randomly sampled suicide attempt codes for 300 patients with at least one pair of suicide attempt codes documented at least one but no more than 90 days apart. Supervised chart reviewers assigned the clinical settings (ie, emergency department [ED] versus non-ED), methods of suicide attempt, and intercode interval (number of days). The probability (or positive predictive value) that the second suicide attempt code in a given pair of codes referred to a distinct suicide attempt event from its preceding suicide attempt code was calculated by clinical setting, method, and intercode interval.
Results: Of 1015 code pairs reviewed, 835 (82.3%) were nonindependent (ie, the 2 codes referred to the same suicide attempt event). When the second code in a pair was documented in a clinical setting other than the ED, it represented a distinct suicide attempt 3.3% of the time. The more time elapsed between codes, the more likely the second code in a pair referred to a distinct suicide attempt event from its preceding code. Code pairs in which the second suicide attempt code was assigned in an ED at least 5 days after its preceding suicide attempt code had a positive predictive value of 0.90.
Conclusions: EHR-based suicide risk prediction models that include International Classification of Diseases codes for prior suicide attempts as a predictor may be highly susceptible to bias due to data leakage in model training. We derived a simple rule to distinguish codes that reflect new, independent suicide attempts: suicide attempt codes documented in an ED setting at least 5 days after a preceding suicide attempt code can be confidently treated as new events in EHR-based suicide risk prediction models. This rule has the potential to minimize upward bias in model performance when prior suicide attempts are included as predictors in EHR-based suicide risk prediction models.
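The rule stated in the abstract (ED setting, at least 5 days after the preceding suicide attempt code) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `AttemptCode`, `flag_new_events`, and `positive_predictive_value` are hypothetical, the input is assumed to be the codes of a single patient, and treating a patient's first code as a new event is an assumption the abstract does not address.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Sequence, Tuple

MIN_GAP_DAYS = 5  # threshold taken from the abstract's rule


@dataclass(frozen=True)
class AttemptCode:
    """One suicide attempt diagnostic code (hypothetical record layout)."""
    code_date: date
    in_ed: bool  # True if documented in an emergency department


def flag_new_events(codes: Sequence[AttemptCode]) -> List[Tuple[AttemptCode, bool]]:
    """Pair each code of ONE patient with a flag: True if treated as a new event.

    A code is flagged when it was documented in an ED at least MIN_GAP_DAYS
    after the immediately preceding code (in any setting).
    """
    ordered = sorted(codes, key=lambda c: c.code_date)
    result: List[Tuple[AttemptCode, bool]] = []
    for i, code in enumerate(ordered):
        if i == 0:
            # Assumption: the first code has no predecessor, so it is
            # treated as a new event. The abstract does not cover this case.
            result.append((code, True))
            continue
        gap_days = (code.code_date - ordered[i - 1].code_date).days
        result.append((code, code.in_ed and gap_days >= MIN_GAP_DAYS))
    return result


def positive_predictive_value(flagged: Sequence[bool], truth: Sequence[bool]) -> float:
    """Share of flagged codes that chart review confirmed as distinct events."""
    if len(flagged) != len(truth):
        raise ValueError("flagged and truth must have the same length")
    n_flagged = sum(flagged)
    if n_flagged == 0:
        raise ValueError("no codes were flagged; PPV is undefined")
    true_positives = sum(f and t for f, t in zip(flagged, truth))
    return true_positives / n_flagged


if __name__ == "__main__":
    # Made-up dates for illustration only; not data from the study.
    patient_codes = [
        AttemptCode(date(2020, 1, 1), in_ed=True),
        AttemptCode(date(2020, 1, 3), in_ed=False),
        AttemptCode(date(2020, 1, 10), in_ed=True),
    ]
    for code, is_new in flag_new_events(patient_codes):
        print(code.code_date, "ED" if code.in_ed else "non-ED", is_new)
```

Comparing each code only with its immediate predecessor mirrors the pairwise design described in the Methods; a pipeline that instead measured the gap from the last flagged event would be a different rule and would need its own validation.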

Subjects

Subjects :
Medicine

Details

Language :
English
ISSN :
2561326X
Volume :
8
Database :
Directory of Open Access Journals
Journal :
JMIR Formative Research
Publication Type :
Academic Journal
Accession number :
edsdoj.459c9f254fec45bbb9ab2a20cae30586
Document Type :
article
Full Text :
https://doi.org/10.2196/46364