1. An atlas of genetic scores to predict multi-omic traits
- Author
Yu Xu, Scott C. Ritchie, Yujian Liang, Paul R. H. J. Timmers, Maik Pietzner, Loïc Lannelongue, Samuel A. Lambert, Usman A. Tahir, Sebastian May-Wilson, Carles Foguet, Åsa Johansson, Praveen Surendran, Artika P. Nath, Elodie Persyn, James E. Peters, Clare Oliver-Williams, Shuliang Deng, Bram Prins, Jian’an Luan, Lorenzo Bomba, Nicole Soranzo, Emanuele Di Angelantonio, Nicola Pirastu, E. Shyong Tai, Rob M. van Dam, Helen Parkinson, Emma E. Davenport, Dirk S. Paul, Christopher Yau, Robert E. Gerszten, Anders Mälarstig, John Danesh, Xueling Sim, Claudia Langenberg, James F. Wilson, Adam S. Butterworth, Michael Inouye, Xu, Yu [0000-0002-7304-5045], Ritchie, Scott C [0000-0002-8454-9548], Timmers, Paul RHJ [0000-0002-5197-1267], Pietzner, Maik [0000-0003-3437-9963], Lannelongue, Loïc [0000-0002-9135-1345], Lambert, Samuel A [0000-0001-8222-008X], May-Wilson, Sebastian [0000-0003-2668-5717], Foguet, Carles [0000-0001-8494-9595], Johansson, Åsa [0000-0002-2915-4498], Persyn, Elodie [0000-0001-9104-1972], Peters, James E [0000-0002-9415-3440], Prins, Bram [0000-0001-5774-034X], Luan, Jian'an [0000-0003-3137-6337], Soranzo, Nicole [0000-0003-1095-3852], Tai, E Shyong [0000-0003-2929-8966], Paul, Dirk S [0000-0002-8230-0116], Yau, Christopher [0000-0001-7615-8523], Gerszten, Robert E [0000-0002-6767-7687], Mälarstig, Anders [0000-0003-2608-1358], Sim, Xueling [0000-0002-1233-7642], Langenberg, Claudia [0000-0002-5017-7344], Wilson, James F [0000-0001-5751-9178], Butterworth, Adam S [0000-0002-6915-9015], Inouye, Michael [0000-0001-9413-6520], and Apollo - University of Cambridge Repository
- Subjects
Proteomics; European People; Internet; Multidisciplinary; Asian; Proteome; Databases, Factual; Datasets as Topic; Reproducibility of Results; Coronary Artery Disease; Multiomics; United Kingdom; Machine Learning; Black or African American; Cohort Studies; Plasma; Phenotype; Metabolome; Humans; Metabolomics
- Abstract
The use of omic modalities to dissect the molecular underpinnings of common diseases and traits is becoming increasingly common. But multi-omic traits can be genetically predicted, which enables highly cost-effective and powerful analyses for studies that do not have multi-omics [1]. Here we examine a large cohort (the INTERVAL study [2]; n = 50,000 participants) with extensive multi-omic data for plasma proteomics (SomaScan, n = 3,175; Olink, n = 4,822), plasma metabolomics (Metabolon HD4, n = 8,153), serum metabolomics (Nightingale, n = 37,359) and whole-blood Illumina RNA sequencing (n = 4,136), and use machine learning to train genetic scores for 17,227 molecular traits, including 10,521 that reach Bonferroni-adjusted significance. We evaluate the performance of genetic scores through external validation across cohorts of individuals of European, Asian and African American ancestries. In addition, we show the utility of these multi-omic genetic scores by quantifying the genetic control of biological pathways and by generating a synthetic multi-omic dataset of the UK Biobank [3] to identify disease associations using a phenome-wide scan. We highlight a series of biological insights with regard to genetic mechanisms in metabolism and canonical pathway associations with disease; for example, JAK-STAT signalling and coronary atherosclerosis. Finally, we develop a portal (https://www.omicspred.org/) to facilitate public access to all genetic scores and validation results, as well as to serve as a platform for future extensions and enhancements of multi-omic genetic scores.
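To illustrate what applying a genetic score to genotype data can look like, the sketch below assumes the common additive form, in which a predicted trait value is the weighted sum of an individual's effect-allele dosages. This is an assumption made for illustration only: the abstract does not specify the model form, and the function name, variant identifiers and weights here are hypothetical, not taken from the paper or from the OmicsPred portal.

```python
from typing import Mapping


def apply_genetic_score(
    dosages: Mapping[str, float],
    weights: Mapping[str, float],
) -> float:
    """Return an additive genetic score for one individual.

    dosages: effect-allele dosage (0 to 2) per variant identifier.
    weights: per-variant effect weight of a hypothetical, already trained score.
    Variants missing from `dosages` contribute 0 (a simplification).
    """
    return sum(weight * dosages.get(variant, 0.0)
               for variant, weight in weights.items())


# Hypothetical weights for a three-variant score (illustrative values only).
weights = {"variant_a": 0.10, "variant_b": -0.05, "variant_c": 0.20}

# Hypothetical effect-allele dosages for two individuals.
individuals = [
    {"variant_a": 0, "variant_b": 1, "variant_c": 2},
    {"variant_a": 2, "variant_b": 0, "variant_c": 1},
]

scores = [apply_genetic_score(person, weights) for person in individuals]
```

In a setting such as the one described, one score of this kind would be evaluated per molecular trait for each genotyped individual, which is how genetically predicted values can stand in for traits that were not measured directly.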
- Published
- 2023