1. A blended genome and exome sequencing method captures genetic variation in an unbiased, high-quality, and cost-effective manner.
- Author
Boltz TA, Chu BB, Liao C, Sealock JM, Ye R, Majara L, Fu JM, Service S, Zhan L, Medland SE, Chapman SB, Rubinacci S, DeFelice M, Grimsby JL, Abebe T, Alemayehu M, Ashaba FK, Atkinson EG, Bigdeli T, Bradway AB, Brand H, Chibnik LB, Fekadu A, Gatzen M, Gelaye B, Gichuru S, Gildea ML, Hill TC, Huang H, Hubbard KM, Injera WE, James R, Joloba M, Kachulis C, Kalmbach PR, Kamulegeya R, Kigen G, Kim S, Koen N, Kwobah EK, Kyebuzibwa J, Lee S, Lennon NJ, Lind PA, Lopera-Maya EA, Makale J, Mangul S, McMahon J, Mowlem P, Musinguzi H, Mwema RM, Nakasujja N, Newman CP, Nkambule LL, O'Neil CR, Olivares AM, Olsen CM, Ongeri L, Parsa SJ, Pretorius A, Ramesar R, Reagan FL, Sabatti C, Schneider JA, Shiferaw W, Stevenson A, Stricker E, Stroud RE 2nd, Tang J, Whiteman D, Yohannes MT, Yu M, Yuan K, Akena D, Atwoli L, Kariuki SM, Koenen KC, Newton CRJC, Stein DJ, Teferra S, Zingela Z, Pato CN, Pato MT, Lopez-Jaramillo C, Freimer N, Ophoff RA, Olde Loohuis LM, Talkowski ME, Neale BM, Howrigan DP, and Martin AR
- Abstract
We deployed the Blended Genome Exome (BGE), a DNA library blending approach that generates low-pass whole genome (1-4× mean depth) and deep whole exome (30-40× mean depth) data in a single sequencing run. This technology is cost-effective, empowers most genomic discoveries possible with deep whole genome sequencing, and provides an unbiased method to capture the diversity of common SNP variation across the globe. To evaluate this new technology at scale, we applied BGE to sequence >53,000 samples from the Populations Underrepresented in Mental Illness Associations Studies (PUMAS) Project, which included participants across African, African American, and Latin American populations. We evaluated the accuracy of BGE imputed genotypes against raw genotype calls from the Illumina Global Screening Array. All PUMAS cohorts had R² concordance ≥95% among SNPs with MAF≥1%, and R² never fell below 90% for SNPs with MAF<1%. Furthermore, concordance rates among local ancestries within two recently admixed cohorts were consistent among SNPs with MAF≥1%, with only minor deviations among SNPs with MAF<1%. We also benchmarked the capacity of BGE to discover protein-coding copy number variants (CNVs) against deep whole genome data, finding that deletions and duplications spanning at least 3 exons had a positive predictive value of ~90%. Our results demonstrate the scalability and efficacy of BGE in capturing SNPs, indels, and CNVs in the human genome at 28% of the cost of deep whole-genome sequencing. BGE is poised to enhance access to genomic testing and empower genomic discoveries, particularly in underrepresented populations.
- Published
- 2024
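The abstract reports two kinds of accuracy metric: R² concordance between imputed genotypes and array calls, and positive predictive value for CNV calls. The abstract does not say how either was computed, so the following is only a minimal sketch of the standard definitions, not the authors' pipeline. It assumes R² means the squared Pearson correlation between imputed dosages and array genotypes coded as 0, 1, or 2 alternate alleles; the function names and toy values are hypothetical.

```python
from math import fsum


def squared_pearson(imputed_dosages, array_genotypes):
    """Squared Pearson correlation (R^2) between two equal-length sequences.

    Assumption: imputed dosages are expected alternate-allele counts in [0, 2]
    and array genotypes are hard calls coded 0, 1, or 2.
    """
    n = len(imputed_dosages)
    if n != len(array_genotypes) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_x = fsum(imputed_dosages) / n
    mean_y = fsum(array_genotypes) / n
    sxy = fsum((x - mean_x) * (y - mean_y)
               for x, y in zip(imputed_dosages, array_genotypes))
    sxx = fsum((x - mean_x) ** 2 for x in imputed_dosages)
    syy = fsum((y - mean_y) ** 2 for y in array_genotypes)
    if sxx == 0 or syy == 0:
        # No variation in one of the inputs: the correlation is undefined.
        return float("nan")
    return (sxy * sxy) / (sxx * syy)


def positive_predictive_value(true_positives, false_positives):
    """Fraction of calls that are confirmed by the reference data."""
    total_calls = true_positives + false_positives
    if total_calls == 0:
        raise ValueError("no calls to evaluate")
    return true_positives / total_calls


if __name__ == "__main__":
    # Hypothetical toy data for one SNP across six samples (not from the study).
    dosages = [0.05, 0.98, 1.90, 0.10, 1.02, 2.00]
    array_calls = [0, 1, 2, 0, 1, 2]
    print("R^2:", squared_pearson(dosages, array_calls))
    # Hypothetical CNV counts: 9 calls confirmed, 1 not confirmed.
    print("PPV:", positive_predictive_value(9, 1))
```

In practice such an R² would be computed per SNP and then summarized within minor allele frequency bins (for example MAF≥1% versus MAF<1%), which is the stratification the abstract describes.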