Explainable machine learning predicts survival of retroperitoneal liposarcoma: A study based on the SEER database and external validation in China.
- Source :
- Cancer Medicine. Jun2024, Vol. 13 Issue 11, p1-18. 18p.
- Publication Year :
- 2024
Abstract
- Objective: We developed explainable machine learning models to predict the overall survival (OS) of patients with retroperitoneal liposarcoma (RLPS). This approach aims to enhance the explainability and transparency of our modeling results. Methods: We collected clinicopathological information on RLPS patients from the Surveillance, Epidemiology, and End Results (SEER) database and allocated the patients to training and validation sets at a 7:3 ratio. We also obtained an external validation cohort from The First Affiliated Hospital of Naval Medical University (Shanghai, China). We performed LASSO regression and multivariate Cox proportional hazards analysis to identify relevant risk factors, which were then combined to develop six machine learning (ML) models: Cox proportional hazards model (Coxph), random survival forest (RSF), ranger, gradient boosting with component‐wise linear models (GBM), decision trees, and boosting trees. The predictive performance of these ML models was evaluated using the concordance index (C‐index), the integrated cumulative/dynamic area under the curve (AUC), the integrated Brier score, and the Cox–Snell residual plot. We used time‐dependent variable importance, partial dependence survival plots, and aggregated survival SHapley Additive exPlanations (SurvSHAP) plots to provide a global explanation of the optimal model. Additionally, SurvSHAP(t) and survival local interpretable model‐agnostic explanations (SurvLIME) plots were used to provide a local explanation of the optimal model. Results: The final ML models consist of six factors: the patient's age, gender, marital status, and surgical history, as well as the tumor's histopathological classification, histological grade, and SEER stage. Our prognostic models show significant discriminative ability, with the ranger model performing best. In the training set, validation set, and external validation set, the AUCs for 1-, 3-, and 5-year OS are all above 0.83, and the integrated Brier scores are consistently below 0.15. The explainability analysis of the ranger model indicates that histological grade, histopathological classification, and age are the most influential factors in predicting OS. Conclusions: The ranger ML prognostic model shows the best performance and can be used to predict the OS of RLPS patients, offering a valuable reference to help clinicians make informed decisions in advance. [ABSTRACT FROM AUTHOR]
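For readers unfamiliar with two of the steps named in the abstract, the following is a minimal Python sketch of a 7:3 random split and of Harrell's concordance index for right-censored survival data. It is illustrative only and is not the authors' code: the function names `split_7_3` and `concordance_index`, the fixed seed, and the choice to skip pairs with tied survival times are all assumptions made here, and the toy numbers are invented purely to exercise the functions.

```python
import random
from itertools import combinations


def split_7_3(records, seed=0):
    """Shuffle records and split them 7:3 (hypothetical helper, seed is an assumption)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    times: observed follow-up times
    events: 1 if death was observed, 0 if censored
    risk_scores: higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): pairs with tied times are skipped
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Invented toy data: follow-up in months, event flags, predicted risk
    times = [5, 10, 15, 20]
    events = [1, 1, 0, 1]
    risk = [0.9, 0.7, 0.4, 0.1]
    print(concordance_index(times, events, risk))
    train, valid = split_7_3(range(10))
    print(len(train), len(valid))
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk. Established implementations handle ties and censoring more carefully than this sketch.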
Details
- Language :
- English
- ISSN :
- 20457634
- Volume :
- 13
- Issue :
- 11
- Database :
- Academic Search Index
- Journal :
- Cancer Medicine
- Publication Type :
- Academic Journal
- Accession number :
- 177929491
- Full Text :
- https://doi.org/10.1002/cam4.7324