1. Significant sparse polygenic risk scores across 813 traits in UK Biobank.
- Author
Tanigawa, Yosuke; Qian, Junyang; Venkataraman, Guhan; Justesen, Johanne Marie; Li, Ruilin; Tibshirani, Robert; Hastie, Trevor; and Rivas, Manuel A.
- Subjects
DISEASE risk factors, MONOGENIC & polygenic inheritance (Genetics), GENETIC variation, DISEASE susceptibility, STATISTICAL significance, PRINCIPAL components analysis, HERITABILITY, MULTITRAIT multimethod techniques
- Abstract
We present a systematic assessment of polygenic risk score (PRS) prediction across more than 1,500 traits using genetic and phenotype data in the UK Biobank. We report 813 sparse PRS models with significant (p < 2.5 × 10⁻⁵) incremental predictive performance when compared against the covariate-only model that considers age, sex, types of genotyping arrays, and the principal component loadings of genotypes. We report a significant correlation between the number of genetic variants selected in the sparse PRS model and the incremental predictive performance (Spearman's ρ = 0.61, p = 2.2 × 10⁻⁵⁹ for quantitative traits, ρ = 0.21, p = 9.6 × 10⁻⁴ for binary traits). The sparse PRS model trained on European individuals showed limited transferability when evaluated on non-European individuals in the UK Biobank. We provide the PRS model weights on the Global Biobank Engine (https://biobankengine.stanford.edu/prs).

Author summary: Polygenic risk score (PRS), an approach to estimate genetic predisposition on disease liability by aggregating the effects across multiple genetic variants, has attracted increasing research interest. While there have been improvements in the predictive performance of PRS for some traits, the applicability of PRS models across a wide range of human traits has not been clear. Here, applying penalized regression using Batch Screening Iterative Lasso (BASIL) algorithm to more than 269,000 individuals of white British ancestry in UK Biobank, we systematically characterize PRS models across more than 1,500 traits. We report 813 traits with PRS models of statistically significant predictive performance. While the statistical significance does not necessarily directly translate into clinical relevance, we investigate the properties of the 813 significant PRS models and report a significant correlation between predictive performance and estimated SNP-based heritability. We find that the number of genetic variants selected in our sparse PRS model is significantly correlated with the incremental predictive performance in both quantitative and binary traits.
Our transferability assessment of PRS models in UK Biobank revealed that the sparse PRS models trained on individuals of European ancestry had a lower predictive performance for individuals of African and Asian ancestry groups. [ABSTRACT FROM AUTHOR]
- Published
- 2022