
Explainable machine learning predicts survival of retroperitoneal liposarcoma: A study based on the SEER database and external validation in China.

Authors :
Wang, Maoyu
Li, Zhizhou
Zeng, Shuxiong
Wang, Ziwei
Ying, Yidie
He, Wei
Zhang, Zhensheng
Wang, Huiqing
Xu, Chuanliang
Source :
Cancer Medicine. Jun 2024, Vol. 13 Issue 11, p1-18. 18p.
Publication Year :
2024

Abstract

Objective: We developed explainable machine learning models to predict the overall survival (OS) of retroperitoneal liposarcoma (RLPS) patients. This approach aims to enhance the explainability and transparency of our modeling results. Methods: We collected clinicopathological information on RLPS patients from the Surveillance, Epidemiology, and End Results (SEER) database and split the patients into training and validation sets at a 7:3 ratio. We also obtained an external validation cohort from The First Affiliated Hospital of Naval Medical University (Shanghai, China). We performed LASSO regression and multivariate Cox proportional hazards analysis to identify relevant risk factors, which were then combined to develop six machine learning (ML) models: Cox proportional hazards model (Coxph), random survival forest (RSF), ranger, gradient boosting with component‐wise linear models (GBM), decision trees, and boosting trees. The predictive performance of these ML models was evaluated using the concordance index (C‐index), the integrated cumulative/dynamic area under the curve (AUC), the integrated Brier score, and the Cox–Snell residual plot. We used time‐dependent variable importance, partial dependence survival plots, and aggregated survival SHapley Additive exPlanations (SurvSHAP) plots to provide a global explanation of the optimal model. Additionally, SurvSHAP(t) and survival local interpretable model‐agnostic explanations (SurvLIME) plots were used to provide a local explanation of the optimal model. Results: The final ML models comprise six factors: patient age, gender, marital status, and surgical history, as well as tumor histopathological classification, histological grade, and SEER stage. The prognostic models show significant discriminative ability, with the ranger model performing best. In the training set, validation set, and external validation set, the AUCs for 1‐, 3‐, and 5‐year OS are all above 0.83, and the integrated Brier scores are consistently below 0.15. The explainability analysis of the ranger model also indicates that histological grade, histopathological classification, and age are the most influential factors in predicting OS. Conclusions: The ranger ML prognostic model performs best and can be used to predict the OS of RLPS patients, offering a valuable reference to help clinicians make informed decisions in advance. [ABSTRACT FROM AUTHOR]
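
Illustrative note (not part of the author abstract): the concordance index (C‐index) named in the Methods measures how often a model ranks pairs of patients in the correct order of survival. The sketch below is a minimal, assumed implementation of Harrell's C‐index in plain Python; the function name, argument names, and the choice to skip pairs with tied survival times are illustrative assumptions and are not taken from the study's own code.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times       -- observed follow-up time per patient
    events      -- 1/True if death was observed, 0/False if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: pairs with tied times are skipped
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death was given the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The toy inputs used to check this sketch are made up for illustration and are unrelated to the SEER or external cohorts.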

Details

Language :
English
ISSN :
20457634
Volume :
13
Issue :
11
Database :
Academic Search Index
Journal :
Cancer Medicine
Publication Type :
Academic Journal
Accession number :
177929491
Full Text :
https://doi.org/10.1002/cam4.7324