FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings

Authors :
Ogier du Terrail, Jean
Ayed, Samy-Safwan
Cyffers, Edwige
Grimberg, Felix
He, Chaoyang
Loeb, Regis
Mangold, Paul
Marchand, Tanguy
Marfoq, Othmane
Mushtaq, Erum
Muzellec, Boris
Philippenko, Constantin
Silva, Santiago
Teleńczuk, Maria
Albarqouni, Shadi
Avestimehr, Salman
Bellet, Aurélien
Dieuleveut, Aymeric
Jaggi, Martin
Karimireddy, Sai Praneeth
Lorenzi, Marco
Neglia, Giovanni
Tommasi, Marc
Andreux, Mathieu
Publication Year :
2022

Abstract

Federated Learning (FL) is a novel approach enabling several clients holding sensitive data to collaboratively train machine learning models without centralizing data. The cross-silo FL setting corresponds to the case of a few (2 to 50) reliable clients, each holding medium to large datasets, and is typically found in applications such as healthcare, finance, or industry. While previous works have proposed representative datasets for cross-device FL, few realistic healthcare cross-silo FL datasets exist, thereby slowing algorithmic research in this critical application. In this work, we propose a novel cross-silo dataset suite focused on healthcare, FLamby (Federated Learning AMple Benchmark of Your cross-silo strategies), to bridge the gap between theory and practice of cross-silo FL. FLamby encompasses 7 healthcare datasets with natural splits, covering multiple tasks, modalities, and data volumes, each accompanied by baseline training code. As an illustration, we additionally benchmark standard FL algorithms on all datasets. Our flexible and modular suite allows researchers to easily download datasets, reproduce results, and re-use the different components for their research. FLamby is available at www.github.com/owkin/flamby.

Comment: Accepted to NeurIPS, Datasets and Benchmarks Track; this version fixes typos in the datasets' table and the appendix.

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2210.04620
Document Type :
Working Paper