1. Development of interpretable machine learning models to predict in-hospital prognosis of acute heart failure patients.
- Author
Tanaka M, Kohjitani H, Yamamoto E, Morimoto T, Kato T, Yaku H, Inuzuka Y, Tamaki Y, Ozasa N, Seko Y, Shiba M, Yoshikawa Y, Yamashita Y, Kitai T, Taniguchi R, Iguchi M, Nagao K, Kawai T, Komasa A, Kawase Y, Morinaga T, Toyofuku M, Furukawa Y, Ando K, Kadota K, Sato Y, Kuwahara K, Okuno Y, Kimura T, and Ono K
- Subjects
- Humans; Female; Male; Prognosis; Acute Disease; Aged; Risk Assessment/methods; Registries; Aged, 80 and over; Japan/epidemiology; ROC Curve; Risk Factors; Heart Failure/mortality; Machine Learning; Hospital Mortality/trends
- Abstract
Aims: In recent years, there has been remarkable development in machine learning (ML) models, showing a trend towards high prediction performance. ML models with high prediction performance often become structurally complex and are frequently perceived as black boxes, hindering intuitive interpretation of the prediction results. We aimed to develop ML models with high prediction performance, interpretability, and superior risk stratification to predict in-hospital mortality and worsening heart failure (WHF) in patients with acute heart failure (AHF).
Methods and Results: Based on the Kyoto Congestive Heart Failure registry, which enrolled 4056 patients with AHF, we developed prediction models for in-hospital mortality and WHF using information obtained on the first day of admission (demographics, physical examination, blood test results, etc.). After excluding 16 patients who died on the first or second day of admission, the original dataset (n = 4040) was split 4:1 into training (n = 3232) and test datasets (n = 808). Based on the training dataset, we developed three types of prediction models: (i) the classification and regression trees (CART) model; (ii) the random forest (RF) model; and (iii) the extreme gradient boosting (XGBoost) model. The performance of each model was evaluated using the test dataset, based on metrics including sensitivity, specificity, area under the receiver operating characteristic curve (AUC), Brier score, and calibration slope. For the complex structure of the XGBoost model, we performed SHapley Additive exPlanations (SHAP) analysis, classifying patients into interpretable clusters. In the original dataset, the proportion of females was 44.8% (1809/4040), and the average age was 77.9 ± 12.0 years. The in-hospital mortality rate was 6.3% (255/4040) and the WHF rate was 22.3% (900/4040) in the total study population.
In the in-hospital mortality prediction, the AUC for the XGBoost model was 0.816 [95% confidence interval (CI): 0.815-0.818], surpassing the AUC values for the CART model (0.683, 95% CI: 0.680-0.685) and the RF model (0.755, 95% CI: 0.753-0.757). Similarly, in the WHF prediction, the AUC for the XGBoost model was 0.766 (95% CI: 0.765-0.768), outperforming the AUC values for the CART model (0.688, 95% CI: 0.686-0.689) and the RF model (0.713, 95% CI: 0.711-0.714). In the XGBoost model, interpretable clusters were formed, and the rates of in-hospital mortality and WHF were similar among each cluster in both the training and test datasets.
Conclusions: The XGBoost models with SHAP analysis provide high prediction performance, interpretability, and reproducible risk stratification for in-hospital mortality and WHF for patients with AHF.
(© 2024 The Authors. ESC Heart Failure published by John Wiley & Sons Ltd on behalf of European Society of Cardiology.)
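For readers unfamiliar with the evaluation described in the abstract, the following is a minimal sketch of a 4:1 split and three of the named metrics (AUC, Brier score, sensitivity/specificity), using only the Python standard library. It is not the authors' code: the function names, the random shuffle with a fixed seed, the 0.5 decision threshold, and the toy labels and probabilities are all assumptions made for illustration. Calibration slope, model fitting, and SHAP analysis are omitted.

```python
import random


def split_4_to_1(n_rows, seed=0):
    """Shuffle row indices and split them 4:1 into training and test indices.

    Assumption: a simple random split; the abstract does not say how the
    split was drawn.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = n_rows // 5  # one fifth goes to the test set
    return indices[n_test:], indices[:n_test]


def auc(y_true, y_prob):
    """AUC as the chance that a random positive case is scored above a
    random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def sensitivity_specificity(y_true, y_prob, threshold=0.5):
    """Sensitivity and specificity at a probability threshold (assumed 0.5)."""
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for y, p in pairs if y == 1 and p >= threshold)
    fn = sum(1 for y, p in pairs if y == 1 and p < threshold)
    tn = sum(1 for y, p in pairs if y == 0 and p < threshold)
    fp = sum(1 for y, p in pairs if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Split sizes for the cohort size reported in the abstract (n = 4040).
train_idx, test_idx = split_4_to_1(4040)  # 3232 training rows, 808 test rows

# Toy outcomes and predicted probabilities (made up, not registry data).
toy_labels = [0, 0, 1, 1, 0, 1]
toy_probs = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

toy_auc = auc(toy_labels, toy_probs)
toy_brier = brier_score(toy_labels, toy_probs)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_probs)
```

The pairwise AUC above is quadratic in the number of cases, which is fine for a test set of 808 patients but would normally be replaced by a rank-based computation or a library routine on larger data.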
- Published
- 2024