1. Machine learning does not outperform traditional statistical modelling for kidney allograft failure prediction
- Author
Agathe Truchot, Marc Raynaud, Nassim Kamar, Maarten Naesens, Christophe Legendre, Michel Delahousse, Olivier Thaunat, Matthias Buchler, Marta Crespo, Kamilla Linhares, Babak J. Orandi, Enver Akalin, Gervacio Soler Pujol, Helio Tedesco Silva, Gaurav Gupta, Dorry L. Segev, Xavier Jouven, Andrew J. Bentall, Mark D. Stegall, Carmen Lefaucheur, Olivier Aubert, and Alexandre Loupy
- Subjects
Nephrology
- Abstract
Machine learning (ML) models have recently shown potential for predicting kidney allograft outcomes. However, their ability to outperform traditional approaches remains poorly investigated. Therefore, using large cohorts of kidney transplant recipients from 14 centers worldwide, we developed ML-based prediction models for kidney allograft survival and compared their prediction performances to those achieved by a validated Cox-Based Prognostication System (CBPS). In a French derivation cohort of 4000 patients, candidate determinants of allograft failure including donor, recipient and transplant-related parameters were used as predictors to develop tree-based models (RSF, RSF-ERT, CIF), Support Vector Machine models (LK-SVM, AK-SVM) and a gradient boosting model (XGBoost). Models were externally validated with cohorts of 2214 patients from Europe, 1537 from North America, and 671 from South America. Among these 8422 kidney transplant recipients, 1081 (12.84%) lost their grafts after a median post-transplant follow-up time of 6.25 years (interquartile range 4.33-8.73). At seven years post-risk evaluation, the ML models achieved a C-index of 0.788 (95% bootstrap percentile confidence interval 0.736-0.833), 0.779 (0.724-0.825), 0.786 (0.735-0.832), 0.527 (0.456-0.602), 0.704 (0.648-0.759) and 0.767 (0.711-0.815) for RSF, RSF-ERT, CIF, LK-SVM, AK-SVM and XGBoost, respectively, compared with 0.808 (0.792-0.829) for the CBPS. In validation cohorts, the ML models' discrimination performances were in a range similar to that of the CBPS. Calibrations of the ML models were similar to or less accurate than those of the CBPS. Thus, when using a transparent methodological pipeline in validated international cohorts, ML models, despite overall good performances, do not outperform a traditional CBPS in predicting kidney allograft failure. Hence, our current study supports the continued use of traditional statistical approaches for kidney graft prognostication.
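The discrimination figures in the abstract are C-indices with 95% bootstrap percentile confidence intervals. As a rough illustration of what such a figure measures, the following is a minimal sketch, not the authors' pipeline: it computes a plain Harrell's C-index without the seven-year truncation used in the study, ignores pairs with tied times, and runs on made-up toy data. The function names `harrell_c_index` and `bootstrap_percentile_ci` and all values are hypothetical.

```python
import random

def harrell_c_index(times, events, risks):
    """Share of comparable pairs in which the earlier failure has the higher risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a comparable pair must be anchored on an observed failure
        for j in range(n):
            if times[i] < times[j]:  # subject i failed before subject j was last seen
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

def bootstrap_percentile_ci(times, events, risks, n_boot=500, alpha=0.05, seed=0):
    """Percentile interval from resampling subjects with replacement."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            stats.append(harrell_c_index([times[k] for k in idx],
                                         [events[k] for k in idx],
                                         [risks[k] for k in idx]))
        except ValueError:
            continue  # skip resamples that contain no comparable pair
    stats.sort()
    lower = stats[int((alpha / 2) * (len(stats) - 1))]
    upper = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lower, upper

# Hypothetical toy data: follow-up in years, 1 = graft failure, 0 = censored,
# and an arbitrary predicted risk score per patient.
toy_times = [2, 4, 5, 7, 8, 10]
toy_events = [1, 1, 0, 1, 0, 0]
toy_risks = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1]

toy_c = harrell_c_index(toy_times, toy_events, toy_risks)
toy_ci = bootstrap_percentile_ci(toy_times, toy_events, toy_risks)
```

On this toy data, 10 of the 11 comparable pairs are ordered correctly, so `toy_c` is 10/11. A value of 0.5 corresponds to random ordering, which is why the LK-SVM result of 0.527 reported above indicates almost no discrimination.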
- Published
2023