
Self‐supervised learning improves robustness of deep learning lung tumor segmentation models to CT imaging differences.

Authors :
Jiang, Jue
Rangnekar, Aneesh
Veeraraghavan, Harini
Source :
Medical Physics, Dec 2024, p. 1. 16 pages, 8 illustrations.
Publication Year :
2024

Abstract

Background: Self‐supervised learning (SSL) is an approach for extracting useful feature representations from unlabeled data, enabling fine‐tuning on downstream tasks with limited labeled examples. Self‐pretraining is an SSL approach that uses the curated downstream task dataset for both pretraining and fine‐tuning. The availability of large, diverse, and uncurated public medical image sets presents the opportunity to create foundation models by applying SSL "in the wild" that are robust to imaging variations. However, the benefit of wild‐ versus self‐pretraining has not been studied for medical image analysis.

Purpose: To compare the robustness of wild‐ versus self‐pretrained models created using convolutional neural network (CNN) and transformer (vision transformer [ViT] and hierarchical shifted window [Swin]) architectures for non‐small cell lung cancer (NSCLC) segmentation from 3D computed tomography (CT) scans.

Methods: CNN, ViT, and Swin models were wild‐pretrained using 10,412 unlabeled 3D CTs sourced from The Cancer Imaging Archive and internal datasets. Self‐pretraining was applied to the same networks using a curated public downstream task dataset (n = 377) of patients with NSCLC. Pretext tasks introduced in the self‐distilled masked image transformer were used for both pretraining approaches. All models were fine‐tuned to segment NSCLC (n = 377 training dataset) and tested on two separate datasets containing early‐stage (public, n = 156) and advanced‐stage (internal, n = 196) NSCLC. Models were evaluated in terms of (a) accuracy, (b) robustness to image differences arising from contrast, slice thickness, and reconstruction kernels, and (c) the impact of pretext tasks used for pretraining. Feature reuse was evaluated using centered kernel alignment (CKA).

Results: Wild‐pretrained Swin models showed higher feature reuse in earlier layers and increased feature differentiation close to the output. Wild‐pretrained Swin outperformed self‐pretrained models across the analyzed imaging acquisitions. Neither ViT nor CNN showed a clear benefit of wild‐pretraining compared to self‐pretraining. The masked image prediction pretext task, which forces networks to learn local structure, resulted in higher accuracy than the contrastive task, which models global image information.

Conclusion: Wild‐pretrained Swin networks were more robust than self‐pretrained models to the analyzed CT imaging differences for lung tumor segmentation. ViT and CNN models did not show a clear benefit of wild‐pretraining over self‐pretraining. [ABSTRACT FROM AUTHOR]
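The abstract reports that feature reuse between pretrained and fine‐tuned networks was measured with centered kernel alignment (CKA). The snippet below is a minimal sketch of linear CKA between two layer‐activation matrices; the function name, array shapes, and example data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def linear_cka(features_x: np.ndarray, features_y: np.ndarray) -> float:
    """Linear centered kernel alignment (CKA) between two sets of layer
    activations, each of shape (n_samples, n_features). Higher values
    indicate more similar representations, i.e., more feature reuse."""
    # Center each feature matrix along the sample dimension.
    x = features_x - features_x.mean(axis=0, keepdims=True)
    y = features_y - features_y.mean(axis=0, keepdims=True)

    # Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))

if __name__ == "__main__":
    # Hypothetical example: activations from one encoder layer before and
    # after fine-tuning, computed on the same batch of CT patches.
    rng = np.random.default_rng(0)
    pretrained_feats = rng.normal(size=(64, 256))   # 64 samples, 256 channels
    finetuned_feats = pretrained_feats + 0.1 * rng.normal(size=(64, 256))
    print(f"CKA similarity: {linear_cka(pretrained_feats, finetuned_feats):.3f}")
```

In the paper's setting, such a score would be computed for corresponding layers of the pretrained and fine‐tuned encoders, with values near 1 indicating strong feature reuse and lower values indicating feature differentiation.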

Details

Language :
English
ISSN :
00942405
Database :
Academic Search Index
Journal :
Medical Physics
Publication Type :
Academic Journal
Accession number :
181405954
Full Text :
https://doi.org/10.1002/mp.17541