1. Determining breast cancer biomarker status and associated morphological features using deep learning
- Author
- Lily Peng, Carrie L. Robinson, Emad A. Rakha, Paul Gamble, Peter Regitnig, Melissa Moran, Yun Liu, Hongwu Wang, Michael S. Toss, Greg S. Corrado, Niels Olson, Craig H. Mermel, Trissia Brown, James H. Wren, Po-Hsuan Cameron Chen, Fraser Tan, Isabelle Flament-Auvigne, David F. Steiner, Ronnachai Jaroensri, and David J. Dabbs
- Subjects
0301 basic medicine, Oncology, medicine.medical_specialty, Receiver operating characteristic, business.industry, Deep learning, Estrogen receptor, Context (language use), medicine.disease, 03 medical and health sciences, 030104 developmental biology, 0302 clinical medicine, Breast cancer, 030220 oncology & carcinogenesis, Internal medicine, Progesterone receptor, Medicine, Biomarker (medicine), Artificial intelligence, business, Interpretability
- Abstract
Breast cancer management depends on biomarkers including estrogen receptor, progesterone receptor, and human epidermal growth factor receptor 2 (ER/PR/HER2). Though existing scoring systems are widely used and well validated, they can involve costly preparation and variable interpretation. Additionally, discordances between histology and expected biomarker findings can prompt repeat testing to address biological, interpretative, or technical reasons for unexpected results. We developed three independent deep learning systems (DLS) to directly predict ER/PR/HER2 status for both focal tissue regions (patches) and slides using hematoxylin-and-eosin-stained (H&E) images as input. Models were trained and evaluated using pathologist-annotated slides from three data sources. Areas under the receiver operating characteristic curve (AUCs) were calculated for test sets at both the patch level (>135 million patches, 181 slides) and the slide level (n = 3274 slides, 1249 cases, 37 sites). Interpretability analyses were performed using Testing with Concept Activation Vectors (TCAV), saliency analysis, and pathologist review of clustered patches. The patch-level AUCs are 0.939 (95% CI 0.936–0.941), 0.938 (0.936–0.940), and 0.808 (0.802–0.813) for ER/PR/HER2, respectively. At the slide level, AUCs are 0.86 (95% CI 0.84–0.87), 0.75 (0.73–0.77), and 0.60 (0.56–0.64) for ER/PR/HER2, respectively. Interpretability analyses show known biomarker-histomorphology associations, including associations of low-grade and lobular histology with ER/PR positivity, and of increased inflammatory infiltrates with triple-negative staining. This study presents rapid breast cancer biomarker estimation from routine H&E slides and builds on prior advances by prioritizing interpretability of computationally learned features in the context of existing pathological knowledge.
Breast cancer diagnosis and characterization involve evaluation of marker proteins found inside or on the surface of tumor cells. Three of the most important markers are estrogen receptor (ER), progesterone receptor (PR) and a receptor called HER2. The levels of these markers can influence how a person with breast cancer is treated in the clinic. This study explored the ability of machine learning, whereby computer software is trained to recognise and classify particular image features, to determine the status of these markers in digitized images of routine H&E-stained slides, without the need for biomarker-specific tissue stains. Our results demonstrate that machine learning can automatically predict the status of ER, PR and HER2 in pathology images, and further testing identifies specific image features that enable these predictions. This type of approach may decrease costs and timelines and enable improved quality control in marker detection.
Gamble and Jaroensri et al. develop deep learning systems to predict breast cancer biomarker status using H&E images. Their models enable slide-level and patch-level predictions for ER, PR and HER2, with interpretability analyses highlighting specific histological features associated with these markers.
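The abstract reports each AUC with a 95% confidence interval but does not say how the intervals were obtained. As a rough illustration only, the sketch below computes an AUC from binary labels and model scores and attaches a percentile bootstrap interval. The bootstrap, the function names `auc` and `bootstrap_ci`, and the toy labels and scores are all assumptions made for this example; none of them come from the study.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive example scores
    higher than a random negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the paper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample with only one class has no defined AUC
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data (hypothetical): 1 = biomarker positive, 0 = biomarker negative.
labels = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.70, 0.50]

point = auc(labels, scores)
lower, upper = bootstrap_ci(labels, scores)
print(f"AUC = {point:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```

The pairwise comparison is quadratic in the number of examples, which is fine for a toy case; at the scale of millions of patches a rank-based implementation would be the practical choice.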
- Published
- 2021