1. An explainable transformer model integrating PET and tabular data for histologic grading and prognosis of follicular lymphoma: a multi-institutional digital biopsy study.
- Author
Jiang, Chong, Jiang, Zekun, Zhang, Zitong, Huang, Hexiao, Zhou, Hang, Jiang, Qiuhui, Teng, Yue, Li, Hai, Xu, Bing, Li, Xin, Xu, Jingyan, Ding, Chongyang, Li, Kang, and Tian, Rong
- Abstract
Background: Pathological grade is a critical determinant of clinical outcomes and decision-making in follicular lymphoma (FL). This study aimed to develop a deep learning model as a digital biopsy for the non-invasive identification of FL grade.
Methods: This study retrospectively included 513 FL patients from five independent hospital centers, randomly divided into training, internal validation, and external validation cohorts. A multimodal fusion Transformer model integrating 3D PET tumor images with tabular data was developed to predict FL grade. The model is also equipped with explainability modules, including Gradient-weighted Class Activation Mapping (Grad-CAM) for PET images, SHapley Additive exPlanations (SHAP) analysis for tabular data, and the calculation of predictive contribution ratios for both modalities, to enhance clinical interpretability and reliability. Predictive performance was evaluated using the area under the receiver operating characteristic curve (AUC) and accuracy, and the model's prognostic value was also assessed.
Results: The Transformer model demonstrated high accuracy in grading FL, with AUCs of 0.964–0.985 and accuracies of 90.2–96.7% in the training cohort, and similar performance in the validation cohorts (AUCs 0.936–0.971; accuracies 86.4–97.0%). Ablation studies confirmed that the fusion model outperformed the single-modality models (AUCs 0.974 vs. 0.956; accuracies 89.8% vs. 85.8%). Interpretability analysis revealed that PET images contributed 81–89% of the predictive value, and Grad-CAM highlighted the tumor and peri-tumor regions. The model also effectively stratified patients by survival risk (P < 0.05), highlighting its prognostic value.
Conclusions: We developed an explainable multimodal fusion Transformer model for accurate grading and prognosis of FL, with the potential to aid clinical decision-making.
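As a rough illustration of the kind of architecture the abstract describes, the sketch below shows a minimal multimodal fusion Transformer in PyTorch that tokenizes a 3D PET tumor crop, embeds the tabular variables as an additional token, and classifies FL grade from a fused [CLS] representation. All class names, layer sizes, and the patch size (PETPatchEmbed3D, FusionTransformer, embed_dim=128, patch=8, etc.) are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch, not the published model: a minimal PET + tabular fusion
# Transformer for binary FL grade classification.
import torch
import torch.nn as nn


class PETPatchEmbed3D(nn.Module):
    """Split a 3D PET volume into non-overlapping patches and project them to tokens."""
    def __init__(self, in_ch=1, embed_dim=128, patch=8):
        super().__init__()
        self.proj = nn.Conv3d(in_ch, embed_dim, kernel_size=patch, stride=patch)

    def forward(self, x):                          # x: (B, 1, D, H, W)
        tokens = self.proj(x)                      # (B, E, D', H', W')
        return tokens.flatten(2).transpose(1, 2)   # (B, N_patches, E)


class FusionTransformer(nn.Module):
    """Fuse PET patch tokens and one tabular token with a Transformer encoder."""
    def __init__(self, n_tabular, n_classes=2, embed_dim=128, depth=4, heads=4):
        super().__init__()
        self.pet_embed = PETPatchEmbed3D(embed_dim=embed_dim)
        self.tab_embed = nn.Sequential(nn.Linear(n_tabular, embed_dim), nn.GELU())
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        layer = nn.TransformerEncoderLayer(embed_dim, heads,
                                           dim_feedforward=4 * embed_dim,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(embed_dim, n_classes)

    def forward(self, pet, tab):                   # pet: (B,1,D,H,W), tab: (B, n_tabular)
        pet_tok = self.pet_embed(pet)              # (B, N, E)
        tab_tok = self.tab_embed(tab).unsqueeze(1) # (B, 1, E)
        cls = self.cls_token.expand(pet.size(0), -1, -1)
        z = self.encoder(torch.cat([cls, pet_tok, tab_tok], dim=1))
        return self.head(z[:, 0])                  # logits for FL grade


# Toy usage: one synthetic patient (32^3 PET crop, 10 hypothetical tabular variables).
model = FusionTransformer(n_tabular=10)
logits = model(torch.randn(1, 1, 32, 32, 32), torch.randn(1, 10))
print(logits.shape)  # torch.Size([1, 2])
```

The Grad-CAM, SHAP, and per-modality contribution-ratio analyses described in the abstract would be layered on top of such a backbone; their exact formulation is not specified in the record and is therefore omitted here.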
- Published
- 2025