Optimizing clinico-genomic disease prediction across ancestries: a machine learning strategy with Pareto improvement.

Authors :
Gao, Yan
Cui, Yan
Source :
Genome Medicine; 6/4/2024, Vol. 16 Issue 1, p1-15, 15p
Publication Year :
2024

Abstract

Background: Accurate prediction of an individual's predisposition to diseases is vital for preventive medicine and early intervention. Various statistical and machine learning models have been developed for disease prediction using clinico-genomic data. However, the accuracy of clinico-genomic prediction of diseases may vary significantly across ancestry groups due to their unequal representation in clinical genomic datasets.

Methods: We introduced a deep transfer learning approach to improve the performance of clinico-genomic prediction models for data-disadvantaged ancestry groups. We conducted machine learning experiments on multi-ancestral genomic datasets of lung cancer, prostate cancer, and Alzheimer's disease, as well as on synthetic datasets with built-in data inequality and distribution shifts across ancestry groups.

Results: Deep transfer learning significantly improved disease prediction accuracy for data-disadvantaged populations in our multi-ancestral machine learning experiments. In contrast, transfer learning based on linear frameworks did not achieve comparable improvements for these data-disadvantaged populations.

Conclusions: This study shows that deep transfer learning can enhance fairness in multi-ancestral machine learning by improving prediction accuracy for data-disadvantaged populations without compromising prediction accuracy for other populations, thus providing a Pareto improvement towards equitable clinico-genomic prediction of diseases. [ABSTRACT FROM AUTHOR]
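
Note: The "Pareto improvement" named in the conclusions can be stated precisely: a new model is a Pareto improvement over a baseline if no ancestry group's prediction accuracy gets worse and at least one group's accuracy gets better. The following is a minimal Python sketch of that check only, not the authors' method or code. The function name `is_pareto_improvement`, the group labels, and all accuracy values are hypothetical and chosen for illustration; they are not results from the paper.

```python
from typing import Mapping


def is_pareto_improvement(
    baseline: Mapping[str, float],
    candidate: Mapping[str, float],
    tol: float = 0.0,
) -> bool:
    """Return True if `candidate` is a Pareto improvement over `baseline`.

    Both arguments map a group label to a performance score where higher
    is better (for example, accuracy or AUROC). `tol` is a hypothetical
    tolerance below which a difference is treated as no change.
    """
    if baseline.keys() != candidate.keys():
        raise ValueError("baseline and candidate must cover the same groups")

    # No group may end up worse off than under the baseline.
    no_group_worse = all(candidate[g] >= baseline[g] - tol for g in baseline)
    # At least one group must end up strictly better off.
    some_group_better = any(candidate[g] > baseline[g] + tol for g in baseline)
    return no_group_worse and some_group_better


# Illustrative, made-up scores (NOT taken from the paper).
baseline_scores = {"data_rich": 0.80, "data_disadvantaged": 0.65}
improved_scores = {"data_rich": 0.80, "data_disadvantaged": 0.72}
tradeoff_scores = {"data_rich": 0.78, "data_disadvantaged": 0.75}

print(is_pareto_improvement(baseline_scores, improved_scores))
print(is_pareto_improvement(baseline_scores, tradeoff_scores))
```

In this sketch the second comparison is not a Pareto improvement, because the gain for one group comes at a cost to the other. That is the kind of trade-off the abstract says deep transfer learning avoided.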

Details

Language :
English
ISSN :
1756-994X
Volume :
16
Issue :
1
Database :
Complementary Index
Journal :
Genome Medicine
Publication Type :
Academic Journal
Accession number :
177674525
Full Text :
https://doi.org/10.1186/s13073-024-01345-0