Back to Search Start Over

Trustworthy deep learning framework for the detection of abnormalities in X-ray shoulder images.

Authors :
Laith Alzubaidi
Asma Salhi
Mohammed A Fadhel
Jinshuai Bai
Freek Hollman
Kristine Italia
Roberto Pareyon
A S Albahri
Chun Ouyang
Jose SantamarĂ­a
Kenneth Cutbush
Ashish Gupta
Amin Abbosh
Yuantong Gu
Source :
PLoS ONE, Vol 19, Iss 3, p e0299545 (2024)
Publication Year :
2024
Publisher :
Public Library of Science (PLoS), 2024.

Abstract

Musculoskeletal conditions affect an estimated 1.7 billion people worldwide, causing intense pain and disability. These conditions lead to 30 million emergency room visits yearly, and the numbers are only increasing. However, diagnosing musculoskeletal issues can be challenging, especially in emergencies where quick decisions are necessary. Deep learning (DL) has shown promise in various medical applications. However, previous methods had poor performance and a lack of transparency in detecting shoulder abnormalities on X-ray images due to a lack of training data and better representation of features. This often resulted in overfitting, poor generalisation, and potential bias in decision-making. To address these issues, a new trustworthy DL framework has been proposed to detect shoulder abnormalities (such as fractures, deformities, and arthritis) using X-ray images. The framework consists of two parts: same-domain transfer learning (TL) to mitigate imageNet mismatch and feature fusion to reduce error rates and improve trust in the final result. Same-domain TL involves training pre-trained models on a large number of labelled X-ray images from various body parts and fine-tuning them on the target dataset of shoulder X-ray images. Feature fusion combines the extracted features with seven DL models to train several ML classifiers. The proposed framework achieved an excellent accuracy rate of 99.2%, F1Score of 99.2%, and Cohen's kappa of 98.5%. Furthermore, the accuracy of the results was validated using three visualisation tools, including gradient-based class activation heat map (Grad CAM), activation visualisation, and locally interpretable model-independent explanations (LIME). The proposed framework outperformed previous DL methods and three orthopaedic surgeons invited to classify the test set, who obtained an average accuracy of 79.1%. The proposed framework has proven effective and robust, improving generalisation and increasing trust in the final results.

Subjects

Subjects :
Medicine
Science

Details

Language :
English
ISSN :
19326203
Volume :
19
Issue :
3
Database :
Directory of Open Access Journals
Journal :
PLoS ONE
Publication Type :
Academic Journal
Accession number :
edsdoj.b97977df22a041849ce5fc200804f37f
Document Type :
article
Full Text :
https://doi.org/10.1371/journal.pone.0299545&type=printable