1. Machine learning for early detection of sepsis: an internal and temporal validation study
- Author
- Armando Bedoya, Marshall Nichols, Kristin M. Corey, Anthony Lin, Meredith E. Clement, Katherine Heller, Suresh Balu, Morgan G. Simons, Joseph Futoma, Michael Gao, Nathan Brajer, Cara O'Brien, and Mark Sendak
- Subjects
electronic health records/statistics and numerical data, AcademicSubjects/SCI01060, hospitalization/statistics and numerical data, Vital signs, Health Informatics, support systems, Research and Applications, decision, Machine learning, computer.software_genre, Logistic regression, 01 natural sciences, clinical, Sepsis, 03 medical and health sciences, 0302 clinical medicine, sepsis/mortality, hospital/statistics and numerical data, Medicine, 030212 general & internal medicine, 0101 mathematics, emergency service, business.industry, Proportional hazards model, adult, Deep learning, 010102 general mathematics, medicine.disease, Early warning score, ROC curve, Random forest, Systemic inflammatory response syndrome, retrospective studies, machine learning, Artificial intelligence, AcademicSubjects/SCI01530, AcademicSubjects/MED00010, business, computer
- Abstract
Objective: To determine whether deep learning detects sepsis earlier and more accurately than other models, and to evaluate model performance using implementation-oriented metrics that simulate clinical practice.
Materials and Methods: We trained, and internally and temporally validated, a deep learning model (multi-output Gaussian process and recurrent neural network [MGP–RNN]) to detect sepsis using encounters from adult hospitalized patients at a large tertiary academic center. Sepsis was defined as the presence of 2 or more systemic inflammatory response syndrome (SIRS) criteria, a blood culture order, and at least one element of end-organ failure. The training dataset included demographics, comorbidities, vital signs, medication administrations, and labs from October 1, 2014 to December 1, 2015, while the temporal validation dataset covered March 1, 2018 to August 31, 2018. Comparisons were made to 3 machine learning methods (random forest [RF], Cox regression [CR], and penalized logistic regression [PLR]) and to 3 clinical scores used to detect sepsis (SIRS, quick Sequential Organ Failure Assessment [qSOFA], and National Early Warning Score [NEWS]). Traditional discrimination statistics such as the C-statistic, as well as metrics aligned with operational implementation, were assessed.
Results: The training and internal validation data included 42 979 encounters, while the temporal validation set included 39 786 encounters. The C-statistic for predicting sepsis within 4 h of onset was 0.88 for the MGP–RNN compared to 0.836 for RF, 0.849 for CR, 0.822 for PLR, 0.756 for SIRS, 0.619 for NEWS, and 0.481 for qSOFA. MGP–RNN detected sepsis a median of 5 h in advance. In temporal validation, the MGP–RNN continued to outperform all 7 clinical risk score and machine learning comparisons.
Conclusions: We developed and validated a novel deep learning model to detect sepsis. Using our data elements and feature set, our modeling approach outperformed other machine learning methods and clinical scores.
- Published
- 2020