Deep Learning in Brain MRI: Effect of Data Leakage Due to Slice-level Split Using 2D Convolutional Neural Networks

Authors :
Chiara Marzi
Stefano Diciotti
Selamawet Workalemahu Atnafu
Carlo Tessa
Alba García Seco de Herrera
Luca Citi
Ekin Yagis
Marco Giannelli
Publication Year :
2021
Publisher :
Research Square Platform LLC, 2021.

Abstract

In recent years, 2D convolutional neural networks (CNNs) have been extensively used for the diagnosis of neurological diseases from magnetic resonance imaging (MRI) data due to their potential to discern subtle and intricate patterns. Despite the high performance reported in numerous studies, developing CNN models that generalize well remains challenging because data leakage can be introduced during cross-validation (CV). In this study, we quantitatively assessed the effect of data leakage caused by splitting 3D MRI data at the 2D slice level, using three 2D CNN models for the classification of patients with Alzheimer’s disease (AD) and Parkinson’s disease (PD). Our experiments showed that slice-level CV erroneously boosted the average slice-level accuracy on the test set by 30% on the Open Access Series of Imaging Studies (OASIS), 29% on the Alzheimer’s Disease Neuroimaging Initiative (ADNI), 48% on the Parkinson’s Progression Markers Initiative (PPMI), and 55% on a local de novo PD Versilia dataset. Further tests on a randomly labeled OASIS-derived dataset produced about 96% (erroneous) accuracy with a slice-level split and 50% accuracy with a subject-level split, the latter being the value expected from a randomized experiment. Overall, the effect of an erroneous slice-based CV is severe, especially for small datasets.
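
The difference between the two splitting strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (slice_level_folds, subject_level_folds, leaked_subjects), the toy subject IDs, and the fold count are all hypothetical, chosen only to show why a slice-level split lets slices from one subject land on both sides of a fold boundary while a subject-level split does not.

```python
import random
from collections import defaultdict


def slice_level_folds(slice_subjects, k, seed=0):
    """Leaky split: shuffle slice indices and deal them into k folds,
    ignoring which subject each slice came from."""
    idx = list(range(len(slice_subjects)))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def subject_level_folds(slice_subjects, k, seed=0):
    """Leak-free split: shuffle subjects, then place every slice of a
    given subject in the same fold."""
    by_subject = defaultdict(list)
    for i, subject in enumerate(slice_subjects):
        by_subject[subject].append(i)
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    folds = [[] for _ in range(k)]
    for j, subject in enumerate(subjects):
        folds[j % k].extend(by_subject[subject])
    return folds


def leaked_subjects(slice_subjects, folds):
    """Return the subjects whose slices appear in more than one fold."""
    folds_seen = defaultdict(set)
    for fold_id, fold in enumerate(folds):
        for i in fold:
            folds_seen[slice_subjects[i]].add(fold_id)
    return {s for s, ids in folds_seen.items() if len(ids) > 1}


# Toy data (assumption): 20 subjects, 10 slices per subject.
slice_subjects = [f"sub-{s:02d}" for s in range(20) for _ in range(10)]

leaky = slice_level_folds(slice_subjects, k=5)
clean = subject_level_folds(slice_subjects, k=5)

print("subjects shared across folds (slice-level):",
      len(leaked_subjects(slice_subjects, leaky)))
print("subjects shared across folds (subject-level):",
      len(leaked_subjects(slice_subjects, clean)))
```

In the slice-level case, a model evaluated on a held-out fold has already seen other, highly similar slices from the same brains during training, which is consistent with the inflated accuracies the abstract reports. Libraries such as scikit-learn offer grouped splitters for the same purpose; the plain-Python version here is only meant to make the mechanism visible.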

Details

Database :
OpenAIRE
Accession number :
edsair.doi...........dc4e6c563f2b65ab385e28111786e7aa
Full Text :
https://doi.org/10.21203/rs.3.rs-464091/v1