1. Large scale in silico characterization of repeat expansion variation in human genomes
- Author
- Dana M. Bis-Brewer, Matt C. Danzi, Egor Dolzhenko, Sarah Fazal, Vívian Pedigone Cintra, Stephan Züchner, and Michael A. Eberle
- Subjects
Statistics and Probability; endocrine system; animal structures; Population; Datasets as Topic; Alu element; Library and Information Sciences; Biology; Polymorphism, Single Nucleotide; Genome; Education; Structural variation; 03 medical and health sciences; 0302 clinical medicine; Tandem repeat; Alu Elements; Humans; education; lcsh:Science; 030304 developmental biology; 0303 health sciences; education.field_of_study; Genome, Human; Computer Science Applications; Tandem Repeat Sequences; Evolutionary biology; Microsatellite instability; Human genome; lcsh:Q; Statistics, Probability and Uncertainty; Trinucleotide repeat expansion; 030217 neurology & neurosurgery; hormones, hormone substitutes, and hormone antagonists; Analysis; Information Systems; Reference genome
- Abstract
Significant progress has been made in elucidating single nucleotide polymorphism diversity in the human population. However, the majority of the variation space in the genome is structural and remains partially elusive. Tandem repeats (TRs) are one form of structural variation. Expansions of TRs are responsible for over 40 diseases, but we hypothesize that these represent only a fraction of the pathogenic repeat expansions that exist. Here we characterize long or expanded TR variation, identified using ExpansionHunter Denovo, in 1,115 human genomes as well as in a replication cohort of 2,504 genomes. We found that individual genomes typically harbor several rare, large TRs, generally in non-coding regions of the genome. These large TRs are enriched near Alu elements. The vast majority of them appear to be expansions of smaller TRs that are already present in the reference genome. We provide this TR profile as a resource for comparison with undiagnosed rare disease genomes in order to detect novel disease-causing repeat expansions.
- Published
- 2020