1. Natural language processing to identify suicidal ideation and anhedonia in major depressive disorder
- Author
L. Alexander Vance, Leslie Way, Deepali Kulkarni, Emily O. C. Palmer, Abhijit Ghosh, Melissa Unruh, Kelly M. Y. Chan, Amey Girdhari, and Joydeep Sarkar
- Subjects
Natural language processing, Electronic health records, Suicidal ideation, Anhedonia, Major depressive disorder, Computer applications to medicine. Medical informatics, R858-859.7
- Abstract
Background: Anhedonia and suicidal ideation are symptoms of major depressive disorder (MDD) that are not regularly captured in structured scales but may be captured in unstructured clinical notes. Natural language processing (NLP) techniques may be used to extract longitudinal data on suicidal behaviors and anhedonia from unstructured clinical notes. This study assessed the accuracy of using NLP techniques on electronic health records (EHRs) to identify these symptoms among patients with MDD.
Methods: EHR-derived, de-identified data were used from the NeuroBlu Database (version 23R1), a longitudinal behavioral health real-world database. Mental health clinicians annotated instances of anhedonia and suicidal symptoms in clinical notes, creating a ground truth. Interrater reliability (IRR) was calculated using Krippendorff’s alpha. A novel transformer architecture-based NLP model was trained on clinical notes to recognize linguistic patterns and contextual cues. Each sentence was categorized into one of four labels: (1) anhedonia; (2) suicidal ideation without intent or plan; (3) suicidal ideation with intent or plan; (4) absence of suicidal ideation or anhedonia. The model was assessed using positive predictive value (PPV), negative predictive value (NPV), sensitivity, specificity, F1-score, and AUROC.
Results: The model was trained, tested, and validated on 2,198, 1,247, and 1,016 distinct clinical notes, respectively. IRR was 0.80. For anhedonia, suicidal ideation with intent or plan, and suicidal ideation without intent or plan, the model achieved PPVs of 0.98, 0.93, and 0.87 and F1-scores of 0.98, 0.91, and 0.89 during training, and PPVs of 0.99, 0.95, and 0.87 and F1-scores of 0.99, 0.95, and 0.89 during validation.
Conclusions: NLP techniques can leverage contextual information in EHRs to identify anhedonia and suicidal symptoms in patients with MDD. Integrating structured and unstructured data offers a comprehensive view of MDD’s trajectory, helping healthcare providers deliver timely, effective interventions. Addressing current limitations will further enhance NLP models, enabling more accurate extraction of critical clinical features and supporting personalized, proactive mental health care.
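Note: The per-label metrics named in the abstract (PPV, NPV, sensitivity, specificity, F1-score) are conventionally computed one label at a time, treating that label as positive and the other three as negative. The sketch below illustrates that convention only; it is not the authors' evaluation code. The label strings, the function name `one_vs_rest_metrics`, and the toy sentences are all hypothetical.

```python
# Illustrative sketch only: one-vs-rest metrics for a four-label sentence classifier.
# Label names and toy data are hypothetical, not taken from the study.

LABELS = [
    "anhedonia",
    "si_without_intent_or_plan",
    "si_with_intent_or_plan",
    "neither",
]


def one_vs_rest_metrics(y_true, y_pred, label):
    """Return PPV, NPV, sensitivity, specificity, and F1 for one label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    def ratio(num, den):
        # Undefined ratios (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "ppv": ratio(tp, tp + fp),          # positive predictive value (precision)
        "npv": ratio(tn, tn + fn),          # negative predictive value
        "sensitivity": ratio(tp, tp + fn),  # recall
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),  # harmonic mean of PPV and sensitivity
    }


# Toy gold annotations and model predictions for six sentences (made up).
y_true = ["anhedonia", "anhedonia", "neither",
          "si_with_intent_or_plan", "neither", "si_without_intent_or_plan"]
y_pred = ["anhedonia", "neither", "neither",
          "si_with_intent_or_plan", "anhedonia", "si_without_intent_or_plan"]

for label in LABELS:
    print(label, one_vs_rest_metrics(y_true, y_pred, label))
```

Because the F1-score is the harmonic mean of PPV and sensitivity, a reported PPV and F1 for a label together determine that label's sensitivity.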
- Published
- 2025