Pathological changes or technical artefacts? The problem of the heterogenous databases in COVID-19 CXR image analysis.

Authors :
Socha, Marek
Prażuch, Wojciech
Suwalska, Aleksandra
Foszner, Paweł
Tobiasz, Joanna
Jaroszewicz, Jerzy
Gruszczynska, Katarzyna
Sliwinska, Magdalena
Nowak, Mateusz
Gizycka, Barbara
Zapolska, Gabriela
Popiela, Tadeusz
Przybylski, Grzegorz
Fiedor, Piotr
Pawlowska, Malgorzata
Flisiak, Robert
Simon, Krzysztof
Walecki, Jerzy
Cieszanowski, Andrzej
Szurowska, Edyta
Source :
Computer Methods & Programs in Biomedicine. Oct2023, Vol. 240, pN.PAG-N.PAG. 1p.
Publication Year :
2023

Abstract

When the COVID-19 pandemic commenced in 2020, scientists assisted medical specialists with diagnostic algorithm development. One scientific research area related to COVID-19 diagnosis was medical imaging and its potential to support molecular tests. Unfortunately, several systems reported high accuracy in development but did not fare well in clinical application. The reason was poor generalization, a long-standing issue in AI development. Researchers found many causes of this issue and decided to refer to them as confounders, meaning a set of artefacts and methodological errors associated with the method. We aim to contribute to this effort by highlighting an undiscussed confounder related to image resolution.

20 216 chest X-ray images (CXR) from worldwide centres were analyzed. The CXRs were bijectively projected into the 2D domain by performing Uniform Manifold Approximation and Projection (UMAP) embedding on the radiomic features (rUMAP) or CNN-based neural features (nUMAP) from the penultimate layer of the pre-trained classification neural network. An additional 44 339 thorax CXRs were used for validation. The comprehensive analysis of the multimodality of the density distribution in rUMAP/nUMAP domains and its relation to the original image properties was used to identify the main confounders.

nUMAP revealed a hidden bias of neural networks towards the image resolution, which the regular up-sampling procedure cannot compensate for. The issue appears regardless of the network architecture and is not observed in a high-resolution dataset. The impact of the resolution heterogeneity can be partially diminished by applying advanced deep-learning-based super-resolution networks.

rUMAP and nUMAP are great tools for image homogeneity analysis and bias discovery, as demonstrated by applying them to COVID-19 image data. Nonetheless, nUMAP could be applied to any type of data for which a deep neural network could be constructed. Advanced image super-resolution solutions are needed to reduce the impact of the resolution diversity on the classification network decision. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
01692607
Volume :
240
Database :
Academic Search Index
Journal :
Computer Methods & Programs in Biomedicine
Publication Type :
Academic Journal
Accession number :
170720362
Full Text :
https://doi.org/10.1016/j.cmpb.2023.107684