
Deep learning for automatic segmentation of vestibular schwannoma: a retrospective study from multi-center routine MRI.

Authors :
Kujawa, Aaron
Dorent, Reuben
Connor, Steve
Thomson, Suki
Ivory, Marina
Vahedi, Ali
Guilhem, Emily
Wijethilake, Navodini
Bradford, Robert
Kitchen, Neil
Bisdas, Sotirios
Ourselin, Sebastien
Vercauteren, Tom
Shapey, Jonathan
Source :
Frontiers in Computational Neuroscience; 2024, p1-15, 15p
Publication Year :
2024

Abstract

Automatic segmentation of vestibular schwannoma (VS) from routine clinical MRI has the potential to improve clinical workflow, facilitate treatment decisions, and assist patient management. Previous work demonstrated reliable automatic segmentation performance on datasets of standardized MRI images acquired for stereotactic surgery planning. However, diagnostic clinical datasets are generally more diverse and pose a larger challenge to automatic segmentation algorithms, especially when post-operative images are included. In this work, we show for the first time that automatic segmentation of VS on routine MRI datasets is also possible with high accuracy. We acquired and publicly release a curated multi-center routine clinical (MC-RC) dataset of 160 patients with a single sporadic VS. For each patient, up to three longitudinal MRI exams with contrast-enhanced T1-weighted (ceT1w) (n = 124) and T2-weighted (T2w) (n = 363) images were included, and the VS was manually annotated. Segmentations were produced and verified in an iterative process: (1) initial segmentations by a specialized company; (2) review by one of three trained radiologists; and (3) validation by an expert team. Inter- and intra-observer reliability experiments were performed on a subset of the dataset. A state-of-the-art deep learning framework was used to train segmentation models for VS. Model performance was evaluated on an MC-RC hold-out testing set, another public VS dataset, and a partially public dataset. The generalizability and robustness of the VS deep learning segmentation models increased significantly when trained on the MC-RC dataset. Dice similarity coefficients (DSC) achieved by our model are comparable to those achieved by trained radiologists in the inter-observer experiment. On the MC-RC testing set, median DSCs were 86.2(9.5) for ceT1w, 89.4(7.0) for T2w, and 86.4(8.6) for combined ceT1w+T2w input images.
On another public dataset acquired for Gamma Knife stereotactic radiosurgery, our model achieved median DSCs of 95.3(2.9), 92.8(3.8), and 95.5(3.3), respectively. In contrast, models trained on the Gamma Knife dataset did not generalize well, as illustrated by significant underperformance on the MC-RC routine MRI dataset, highlighting the importance of data variability in the development of robust VS segmentation models. The MC-RC dataset and all trained deep learning models were made available online. [ABSTRACT FROM AUTHOR]
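The Dice similarity coefficient reported throughout the abstract measures volumetric overlap between a predicted segmentation and a reference annotation: DSC = 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (perfect agreement). A minimal sketch of how it is computed on binary masks is shown below; the function name and the toy masks are illustrative assumptions, not code from the paper.

```python
import numpy as np

def dice_similarity_coefficient(pred, target):
    """Dice similarity coefficient (DSC) between two binary masks.

    DSC = 2 * |A intersect B| / (|A| + |B|); 0 = no overlap, 1 = identical.
    Function name and empty-mask convention are illustrative assumptions.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: conventionally treated as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / denom

# Toy 2D example: two overlapping 4x4 square "tumor" masks on an 8x8 grid.
a = np.zeros((8, 8), dtype=bool); a[2:6, 2:6] = True  # 16 voxels
b = np.zeros((8, 8), dtype=bool); b[3:7, 3:7] = True  # 16 voxels
# Overlap is a 3x3 region (9 voxels): DSC = 2*9 / (16+16) = 0.5625
print(dice_similarity_coefficient(a, b))
```

Note that the abstract reports median DSCs across a test set (with spread in parentheses), rather than a single per-image value as in this toy example.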

Details

Language :
English
ISSN :
16625188
Database :
Complementary Index
Journal :
Frontiers in Computational Neuroscience
Publication Type :
Academic Journal
Accession number :
177449688
Full Text :
https://doi.org/10.3389/fncom.2024.1365727