Development of an AI system for accurately diagnose hepatocellular carcinoma from computed tomography imaging data.

Authors :
Wang M
Fu F
Zheng B
Bai Y
Wu Q
Wu J
Sun L
Liu Q
Liu M
Yang Y
Shen H
Kong D
Ma X
You P
Li X
Tian F
Source :
British journal of cancer [Br J Cancer] 2021 Oct; Vol. 125 (8), pp. 1111-1121. Date of Electronic Publication: 2021 Aug 07.
Publication Year :
2021

Abstract

Background and Aims: Computed tomography (CT) is frequently used to detect hepatocellular carcinoma (HCC) in routine clinical practice. The aim of this study was to develop a deep-learning AI system to improve the diagnostic accuracy of HCC by analysing liver CT imaging data.

Methods: We developed a deep-learning AI system by training on CT images from 7512 patients at Henan Provincial People's Hospital. Its performance was validated on one internal test set (Henan Provincial People's Hospital, n = 385) and one external test set (Henan Provincial Cancer Hospital, n = 556). The area under the receiver-operating characteristic curve (AUROC) was used as the primary classification metric. Accuracy, sensitivity, specificity, precision, negative predictive value and F1 metric were used to measure the performance of the AI system and the radiologists.

Results: The AI system achieved high performance in identifying HCC patients, with an AUROC of 0.887 (95% CI 0.855-0.919) on the internal test set and 0.883 (95% CI 0.855-0.911) on the external test set. For the internal test set, accuracy was 81.0% (76.8-84.8%), sensitivity was 78.4% (72.4-83.7%), specificity was 84.4% (78.0-89.6%) and F1 (the harmonic mean of precision and recall) was 0.824. For the external test set, accuracy was 81.3% (77.8-84.5%), sensitivity was 89.4% (85.0-92.8%), specificity was 74.0% (68.5-78.9%) and F1 was 0.819. Compared with radiologists, the AI system achieved comparable accuracy and F1 metric on the internal test set (0.853 vs. 0.818, P = 0.107; 0.863 vs. 0.824, P = 0.082) and the external test set (0.805 vs. 0.793, P = 0.663; 0.810 vs. 0.814, P = 0.866). The HCC risk scores predicted by the AI system were higher in HCC patients with multiple tumours than in those with a solitary tumour, and higher in those with a high fibrosis stage than in those with a low fibrosis stage (tumour number: 0.197 vs. 0.138, P = 0.006; fibrosis stage: 0.183 vs. 0.127, P < 0.001). Radiologists' review showed that the accuracy of the saliency heatmaps predicted by the algorithms was 92.1% (95% CI: 89.2-95.0%).

Conclusions: The AI system achieved high performance in the detection of HCC compared with a group of specialised radiologists. Further investigation in prospective clinical trials is needed to verify this model.

(© 2021. The Author(s), under exclusive licence to Springer Nature Limited.)
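
Note on the metrics: accuracy, sensitivity, specificity, precision, negative predictive value and F1 can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch in Python, not the authors' code. The function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return standard binary-classification metrics from confusion-matrix counts.

    tp/fn: HCC cases flagged / missed; tn/fp: non-HCC cases cleared / flagged.
    Assumes every denominator is non-zero.
    """
    sensitivity = tp / (tp + fn)   # recall: share of true HCC cases detected
    specificity = tn / (tn + fp)   # share of non-HCC cases correctly cleared
    precision = tp / (tp + fp)     # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and recall (sensitivity).
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not data from the study).
example = diagnostic_metrics(tp=80, fp=20, tn=80, fn=20)
```

AUROC is not covered by this sketch because it requires the per-patient risk scores rather than a single set of counts at one decision threshold.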

Details

Language :
English
ISSN :
1532-1827
Volume :
125
Issue :
8
Database :
MEDLINE
Journal :
British journal of cancer
Publication Type :
Academic Journal
Accession number :
34365472
Full Text :
https://doi.org/10.1038/s41416-021-01511-w