Spyretos, Christoforos, Tampu, Iulian Emil, Eklund, Anders, and Haj-Hosseini, Neda
Deep learning models have achieved strong performance in digital pathology, with the potential to provide healthcare professionals with accurate decision-making assistance in their workflow. In this study, vision transformer (ViT) and convolutional neural network (CNN) models were implemented and compared for patch-level classification of four major glioblastoma tissue structures in histology images. A subset of the IvyGAP dataset (41 subjects, 123 images) was stain-normalised, and patches of 256 × 256 pixels were extracted. A per-subject split was applied to obtain training, validation and testing sets. Three models were implemented: a ViT and a CNN trained from scratch, and a ViT pre-trained on a different brain tumour histology dataset. The models' performance was assessed using a range of metrics, including accuracy and the Matthews correlation coefficient (MCC). In addition, calibration experiments using temperature scaling were conducted and evaluated to align the models' predicted confidence with the ground truth. The models' uncertainty was estimated using the Monte Carlo dropout method. Lastly, the models were compared using the Wilcoxon signed-rank test with Bonferroni correction. Among the models, the scratch-trained ViT obtained the highest test accuracy of 67% and an MCC of 0.45. The scratch-trained CNN reached a test accuracy of 49% and an MCC of 0.15, and the pre-trained ViT achieved a test accuracy of only 28% and an MCC of 0.034. Based on a comparison of the reliability diagrams and calibration metrics before and after temperature scaling, the subsequent experiments proceeded with the uncalibrated ViTs and the calibrated CNN. The calibrated CNN demonstrated moderate to high uncertainty across classes, and the ViTs had an overall high uncertainty. Statistically, there was no significant difference among the models at the Bonferroni-corrected significance level of 0.017. In conclusion, the scratch-trained ViT model considerably outperformed the scratch-trained CNN and the pre-trained ViT in classification. However, t