1. Fair evaluation of federated learning algorithms for automated breast density classification: The results of the 2022 ACR-NCI-NVIDIA federated learning challenge
- Author
- Schmidt, Kendall, Bearce, Benjamin, Chang, Ken, Coombs, Laura, Farahani, Keyvan, Elbatel, Marawan, Mouheb, Kaouther, Marti, Robert, Zhang, Ruipeng, Zhang, Yao, Wang, Yanfeng, Hu, Yaojun, Ying, Haochao, Xu, Yuyang, Testagrose, Conrad, Demirer, Mutlu, Gupta, Vikash, Akünal, Ünal, Bujotzek, Markus, Maier-Hein, Klaus H., Qin, Yi, Li, Xiaomeng, Kalpathy-Cramer, Jayashree, and Roth, Holger R.
- Abstract
The correct interpretation of breast density is important in the assessment of breast cancer risk. AI has been shown capable of accurately predicting breast density; however, because imaging characteristics differ across mammography systems, models built using data from one system do not generalize well to other systems. Though federated learning (FL) has emerged as a way to improve the generalizability of AI without the need to share data, the best way to preserve features from all training data during FL is an active area of research. To explore FL methodology, the breast density classification FL challenge was hosted in partnership with the American College of Radiology, Harvard Medical School's Mass General Brigham, University of Colorado, NVIDIA, and the National Institutes of Health National Cancer Institute. Challenge participants were able to submit Docker containers capable of implementing FL on three simulated medical facilities, each containing a unique large mammography dataset. The breast density FL challenge ran from June 15 to September 5, 2022, attracting seven finalists from around the world. The winning FL submission reached a linear kappa score of 0.653 on the challenge test data and 0.413 on an external testing dataset, scoring comparably to a model trained on the same data in a central location.
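The "linear kappa score" quoted in the abstract is presumably Cohen's kappa with linear weights, which penalizes a prediction in proportion to how many ordinal categories it is from the reference label. The following is a minimal sketch of that metric and not the challenge's actual scoring code. It assumes density labels are encoded as integers 0 to 3 (four ordered classes); the function name `linear_weighted_kappa` and the `num_classes` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter


def linear_weighted_kappa(y_true, y_pred, num_classes=4):
    """Cohen's kappa with linear weights for ordinal integer labels.

    Assumption: labels are integers in [0, num_classes - 1]; the default of
    four classes is an illustrative choice, not taken from the abstract.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    for label in list(y_true) + list(y_pred):
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside [0, {num_classes - 1}]")

    n = len(y_true)
    pair_counts = Counter(zip(y_true, y_pred))  # observed confusion counts
    true_counts = Counter(y_true)               # row marginals
    pred_counts = Counter(y_pred)               # column marginals

    # Observed disagreement: each pair costs |i - j|.
    observed_cost = sum(abs(i - j) * c for (i, j), c in pair_counts.items())

    # Disagreement expected by chance, from the product of the marginals.
    expected_cost = sum(
        abs(i - j) * true_counts[i] * pred_counts[j] / n
        for i in range(num_classes)
        for j in range(num_classes)
    )
    if expected_cost == 0:
        raise ValueError("kappa is undefined when all labels fall in one class")

    # The 1 / (num_classes - 1) weight normalization cancels in this ratio.
    return 1.0 - observed_cost / expected_cost


if __name__ == "__main__":
    reference = [0, 1, 2, 3]
    predicted = [1, 1, 2, 3]
    print(linear_weighted_kappa(reference, predicted))
```

A score of 1 means perfect agreement and 0 means agreement no better than chance. If scikit-learn is available, `sklearn.metrics.cohen_kappa_score(y_true, y_pred, weights="linear")` computes the same quantity and can serve as a cross-check.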
- Published
- 2024