18 results on '"Melanie Sorensen"'
Search Results
2. An evolutionary driver of interspersed segmental duplications in primates
- Author
-
Stuart Cantsilieris, Susan M. Sunkin, Matthew E. Johnson, Fabio Anaclerio, John Huddleston, Carl Baker, Max L. Dougherty, Jason G. Underwood, Arvis Sulovari, PingHsun Hsieh, Yafei Mao, Claudia Rita Catacchio, Maika Malig, AnneMarie E. Welch, Melanie Sorensen, Katherine M. Munson, Weihong Jiang, Santhosh Girirajan, Mario Ventura, Bruce T. Lamb, Ronald A. Conlon, and Evan E. Eichler
- Subjects
Segmental duplication; Nuclear pore interacting protein; LCR16a; Gene fusion; Genomic instability; Biology (General); QH301-705.5; Genetics; QH426-470
- Abstract
Background: The complex interspersed pattern of segmental duplications in humans is responsible for rearrangements associated with neurodevelopmental disease, including the emergence of novel genes important in human brain evolution. We investigate the evolution of LCR16a, a putative driver of this phenomenon that encodes one of the most rapidly evolving human–ape gene families, nuclear pore interacting protein (NPIP). Results: Comparative analysis shows that LCR16a has independently expanded in five primate lineages over the last 35 million years of primate evolution. The expansions are associated with independent lineage-specific segmental duplications flanking LCR16a leading to the emergence of large interspersed duplication blocks at non-orthologous chromosomal locations in each primate lineage. The intron-exon structure of the NPIP gene family has changed dramatically throughout primate evolution with different branches showing characteristic gene models yet maintaining an open reading frame. In the African ape lineage, we detect signatures of positive selection that occurred after a transition to more ubiquitous expression among great ape tissues when compared to Old World and New World monkeys. Mouse transgenic experiments from baboon and human genomic loci confirm these expression differences and suggest that the broader ape expression pattern arose due to mutational changes that emerged in cis. Conclusions: LCR16a promotes serial interspersed duplications and creates hotspots of genomic instability that appear to be an ancient property of primate genomes. Dramatic changes to NPIP gene structure and altered tissue expression preceded major bouts of positive selection in the African ape lineage, suggestive of a gene undergoing strong adaptive evolution.
- Published
- 2020
- Full Text
- View/download PDF
3. A high-quality bonobo genome refines the analysis of hominid evolution
- Author
-
Melanie Sorensen, Yafei Mao, Sofie R. Salama, Claudia Rita Catacchio, Andy Wing Chun Pang, Françoise Thibaud-Nissen, Carl Baker, LaDeana W. Hillier, Ruiyang Li, Arvis Sulovari, Philip C. Dishuck, PingHsun Hsieh, Katherine M. Munson, Ludovica Mercuri, Jason D Fernandes, Jessica M. Storer, Joyce V. Lee, Benedict Paten, Mark A. Batzer, Peter A. Audano, David Porubsky, Tzu-Hsueh Huang, Jason G. Underwood, Evan E. Eichler, Jinna Hoffman, William T. Harvey, Kendra Hoekzema, Jerilyn A. Walker, Ian T. Fiddes, David Gordon, Marina Haukness, Alex Hastie, Alexandra P. Lewis, Francesca Antonacci, Mario Ventura, Shwetha C. Murali, Francesco Montinaro, Ilaria Piccolo, and Mark Diekhans
- Subjects
Pan troglodytes; Sequence assembly; Genomics; Biology; Genome informatics; Genome; Article; Evolutionary genetics; Coalescent theory; Evolution, Molecular; 03 medical and health sciences; Segmental Duplications, Genomic; 0302 clinical medicine; Animals; Sequencing; Phylogeny; 030304 developmental biology; Segmental duplication; 0303 health sciences; Gorilla gorilla; Multidisciplinary; Bonobo; Pongo; Molecular Sequence Annotation; Sequence Analysis, DNA; Pan paniscus; biology.organism_classification; Genome evolution; Genes; Evolutionary biology; Eukaryotic Initiation Factor-4A; Female; Human genome; Mobile genetic elements; 030217 neurology & neurosurgery
- Abstract
The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation1,2. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes1,3–5 and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome.
A high-quality bonobo genome assembly provides insights into incomplete lineage sorting in hominids and its relevance to gene evolution and the genetic relationship among living hominids.
- Published
- 2021
- Full Text
- View/download PDF
4. Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes
- Author
-
Karen H. Miga, Miten Jain, Simon Mayes, Kristof Tigyi, Kishwar Shafin, Benedict Paten, Nicholas Maurer, Ryan Lorig-Roach, Colleen M. Bosworth, Justin M. Zook, Duncan Kilburn, Hugh E. Olsen, Vania Costa, David Haussler, Kelvin J. Liu, Richard E. Green, Tobias Marschall, Melanie Sorensen, Paolo Carnevali, Sofie R. Salama, Erik Garrison, Mark Akeson, Marina Haukness, Mitchell R. Vollger, Katherine M. Munson, Fritz J. Sedlazeck, Adam M. Phillippy, Joel Armstrong, Jean Monlong, Sergey Koren, Evan E. Eichler, and Trevor Pesout
- Subjects
Sequence analysis; Biomedical Engineering; Sequence assembly; Bioengineering; Genomics; Computational biology; Haploidy; Biology; Applied Microbiology and Biotechnology; Genome; Article; Chromosomes; 03 medical and health sciences; Deep Learning; 0302 clinical medicine; HLA Antigens; Genetics; Nanotechnology; Chromosomes, Human; Humans; 030304 developmental biology; 0303 health sciences; Genome, Human; Human Genome; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA; DNA; Benchmarking; Nanopore Sequencing; Nanopore; Molecular Medicine; Human genome; Generic health relevance; Nanopore sequencing; Sequence Analysis; Algorithms; 030217 neurology & neurosurgery; Human; Biotechnology; Phred quality score
- Abstract
De novo assembly of a human genome using nanopore long-read sequences has been reported but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly we present Shasta, a de novo long read assembler, and polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and our toolkit, we assembled eleven highly contiguous human genomes de novo in nine days. We achieved ~63x coverage, 42 Kb read N50, and 6.5x coverage in 100 Kb+ reads using three flow cells per sample. Shasta produced a complete haploid human genome assembly in under six hours on a single commercial compute node. MarginPolish and HELEN polished haploid assemblies to more than 99.9% identity (QV30) with nanopore reads alone. Addition of proximity ligation (Hi-C) sequencing enabled near chromosome-level scaffolds for all eleven genomes. We compare our assembly performance to existing methods for diploid, haploid, and trio-binned human samples and report superior accuracy and speed.
Editor's summary: High contiguity human genomes can be assembled de novo in 6 hours using nanopore long-read sequences and the Shasta toolkit.
- Published
- 2020
- Full Text
- View/download PDF
5. Vam6/Vps39/TRAP1‐domain proteins influence vacuolar morphology, iron acquisition and virulence in Cryptococcus neoformans
- Author
-
Erik Bakkeren, Mélissa Caza, James W. Kronstad, Guanggan Hu, Linda C. Horianopoulos, Melanie Sorensen, Eddy Sánchez-León, and Won Hee Jung
- Subjects
Iron; Immunology; Mutant; Saccharomyces cerevisiae; Virulence; medicine.disease_cause; Microbiology; Article; Virulence factor; Fungal Proteins; Mice; 03 medical and health sciences; Virology; Protein targeting; medicine; Animals; Transcription factor; 030304 developmental biology; Cryptococcus neoformans; 0303 health sciences; biology; 030306 microbiology; Permease; Cryptococcosis; biology.organism_classification; Cell biology; Vacuoles
- Abstract
The pathogenic fungus Cryptococcus neoformans must overcome iron limitation to cause disease in mammalian hosts. Previously, we reported a screen for insertion mutants with poor growth on haem as the sole iron source. In this study, we characterised one such mutant and found that the defective gene encoded a Vam6/Vps39/TRAP1 domain-containing protein required for robust growth on haem, an important iron source in host tissue. We designated this protein Vps3 based on reciprocal best matches with the corresponding protein in Saccharomyces cerevisiae. C. neoformans encodes a second Vam6/Vps39/TRAP1 domain-containing protein designated Vam6/Vlp1, and we found that this protein is also required for robust growth on haem as well as on inorganic iron sources. This protein is predicted to be a component of the homotypic fusion and vacuole protein sorting complex involved in endocytosis. Further characterisation of the vam6Δ and vps3Δ mutants revealed perturbed trafficking of iron acquisition functions (e.g., the high affinity iron permease Cft1) and impaired processing of the transcription factor Rim101, a regulator of haem and iron acquisition. The vps3Δ and vam6Δ mutants also had pleiotropic phenotypes including loss of virulence in a mouse model of cryptococcosis, reduced virulence factor elaboration and increased susceptibility to stress, indicating pleiotropic roles for Vps3 and Vam6 beyond haem use in C. neoformans. TAKE AWAYS: Two Vam6/Vps39/TRAP1-domain proteins, Vps3 and Vam6, support the growth of Cryptococcus neoformans on haem. Loss of Vps3 and Vam6 influences the trafficking and expression of iron uptake proteins. Loss of Vps3 or Vam6 eliminates the ability of C. neoformans to cause disease in a mouse model of cryptococcosis.
- Published
- 2021
- Full Text
- View/download PDF
6. Evidence for opposing selective forces operating on human-specific duplicated TCAF genes in Neanderthals and humans
- Author
-
Mitchell R. Vollger, Katherine M. Munson, Philip C. Dishuck, Vy Dang, Yafei Mao, PingHsun Hsieh, Tzu-Hsueh Huang, Melanie Sorensen, Alexandra P. Lewis, Carl Baker, AnneMarie E. Welch, Stuart Cantsilieris, Jason G. Underwood, and Evan E. Eichler
- Subjects
DNA Copy Number Variations; Science; General Physics and Astronomy; Locus (genetics); Evolutionary biology; Biology; Genome informatics; Genome; General Biochemistry, Genetics and Molecular Biology; Haplogroup; Article; Evolution, Molecular; Gene Duplication; Gene duplication; Animals; Humans; Copy-number variation; Selection, Genetic; Phylogeny; Segmental duplication; Neanderthals; Multidisciplinary; Genome, Human; Haplotype; Membrane Proteins; Hominidae; General Chemistry; Haplotypes; Homo sapiens
- Abstract
TRP channel-associated factor 1/2 (TCAF1/TCAF2) proteins antagonistically regulate the cold-sensor protein TRPM8 in multiple human tissues. Understanding their significance has been complicated given the locus spans a gap-ridden region with complex segmental duplications in GRCh38. Using long-read sequencing, we sequence-resolve the locus, annotate full-length TCAF models in primate genomes, and show substantial human-specific TCAF copy number variation. We identify two human super haplogroups, H4 and H5, and establish that TCAF duplications originated ~1.7 million years ago but diversified only in Homo sapiens by recurrent structural mutations. Conversely, in all archaic-hominin samples the fixation for a specific H4 haplotype without duplication is likely due to positive selection. Here, our results of TCAF copy number expansion, selection signals in hominins, and differential TCAF2 expression between haplogroups and high TCAF2 and TRPM8 expression in liver and prostate in modern-day humans imply TCAF diversification among hominins potentially in response to cold or dietary adaptations.
Duplications of gene segments can allow novel physiological adaptations to evolve. A detailed analysis of the TCAF gene family in primates and archaic humans suggests rapid duplication and diversification in this gene family is associated with cold or dietary adaptations.
- Published
- 2021
7. Fully phased human genome assembly without parental data using single-cell strand sequencing and long reads
- Author
-
Peter Ebert, Peter A. Audano, Peter M. Lansdorp, Mark Chaisson, Maryam Ghareghani, Katherine M. Munson, Ashley D. Sanders, Charles Lee, Marina Haukness, Arvis Sulovari, Jana Ebler, Benedict Paten, Evan E. Eichler, Scott E. Devine, Jan O. Korbel, Melanie Sorensen, Tobias Marschall, William T. Harvey, David Porubsky, Mitchell R. Vollger, Pierre Marijon, and Human Genome Structural Variation Consortium
- Subjects
Parents; Cancer Research; Letter; Sequence analysis; Bioinformatics; 0206 medical engineering; Biomedical Engineering; Sequence assembly; Bioengineering; 02 engineering and technology; Computational biology; Biology; Genome informatics; Applied Microbiology and Biotechnology; Genome; Genomic analysis; 03 medical and health sciences; Sequencing; Humans; Indel; 030304 developmental biology; 0303 health sciences; Contig; Genome, Human; Haplotype; Puerto Rico; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA; Haplotypes; Molecular Medicine; Human genome; Nanopore sequencing; Single-Cell Analysis; 020602 bioinformatics; Algorithms; Biotechnology
- Abstract
Human genomes are typically assembled as consensus sequences that lack information on parental haplotypes. Here we describe a reference-free workflow for diploid de novo genome assembly that combines the chromosome-wide phasing and scaffolding capabilities of single-cell strand sequencing1,2 with continuous long-read or high-fidelity3 sequencing data. Employing this strategy, we produced a completely phased de novo genome assembly for each haplotype of an individual of Puerto Rican descent (HG00733) in the absence of parental data. The assemblies are accurate (quality value > 40) and highly contiguous (contig N50 > 23 Mbp) with low switch error rates (0.17%), providing fully phased single-nucleotide variants, indels and structural variants. A comparison of Oxford Nanopore Technologies and Pacific Biosciences phased assemblies identified 154 regions that are preferential sites of contig breaks, irrespective of sequencing technology or phasing algorithms.
Assembly of haplotype-resolved human genomes is achieved by combining short and long reads.
- Published
- 2021
8. The structure, function, and evolution of a complete human chromosome 8
- Author
-
Yafei Mao, Alexandra M. Lewis, Mario Ventura, Vladimir Larionov, Tina A. Graves-Lindsay, Mikhail Liskovykh, Adam M. Phillippy, Evan E. Eichler, Philip C. Dishuck, Melanie Sorensen, Shwetha C. Murali, Urvashi Surti, Chirag Jain, David Porubsky, Andrey Bzikadze, Ludovica Mercuri, Glennis A. Logsdon, Karen H. Miga, PingHsun Hsieh, Sergey Koren, Milinn Kremitzki, Katherine M. Munson, Leonardo G. de Lima, Carl Baker, Arang Rhie, Jennifer L. Gerton, Mitchell R. Vollger, Sergey Nurk, and Kendra Hoekzema
- Subjects
Mutation rate; Variable number tandem repeat; Autosome; Neocentromere; Phylogenetic tree; Evolutionary biology; Satellite DNA; Kinetochore; Chromosome; Biology
- Abstract
The complete assembly of each human chromosome is essential for understanding human biology and evolution. Using complementary long-read sequencing technologies, we complete the first linear assembly of a human autosome, chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.08 Mbp centromeric α-satellite array, a 644 kbp defensin copy number polymorphism important for disease risk, and an 863 kbp variable number tandem repeat at chromosome 8q21.2 that can function as a neocentromere. We show that the centromeric α-satellite array is generally methylated except for a 73 kbp hypomethylated region of diverse higher-order α-satellite enriched with CENP-A nucleosomes, consistent with the location of the kinetochore. Using a dual long-read sequencing approach, we complete the assembly of the orthologous chromosome 8 centromeric regions in chimpanzee, orangutan, and macaque for the first time to reconstruct its evolutionary history. Comparative and phylogenetic analyses show that the higher-order α-satellite structure evolved specifically in the great ape ancestor, and the centromeric region evolved with a layered symmetry, with more ancient higher-order repeats located at the periphery adjacent to monomeric α-satellites. We estimate that the mutation rate of centromeric satellite DNA is accelerated at least 2.2-fold, and this acceleration extends beyond the higher-order α-satellite into the flanking sequence.
- Published
- 2020
- Full Text
- View/download PDF
9. The structure, function and evolution of a complete human chromosome 8
- Author
-
Philip C. Dishuck, Karen H. Miga, Leonardo G. de Lima, Glennis A. Logsdon, Mario Ventura, Katherine M. Munson, Sergey Koren, Mikhail Liskovykh, Evan E. Eichler, Sergey Nurk, Tina A. Graves-Lindsay, Chirag Jain, Andrey Bzikadze, Tatiana Dvorkina, Alexandra M. Lewis, William T. Harvey, Kendra Hoekzema, Arang Rhie, Carl Baker, Jennifer L. Gerton, Ludovica Mercuri, Shwetha C. Murali, Yafei Mao, David Porubsky, Vladimir Larionov, Adam M. Phillippy, Mitchell R. Vollger, Melanie Sorensen, Alla Mikheenko, PingHsun Hsieh, Milinn Kremitzki, and Urvashi Surti
- Subjects
Male; Genome evolution; Pongo abelii; Neocentromere; Pan troglodytes; Centromere; Genomics; Minisatellite Repeats; Biology; DNA, Satellite; Genome; Article; Evolutionary genetics; Cell Line; Epigenesis, Genetic; Evolution, Molecular; 03 medical and health sciences; 0302 clinical medicine; Animals; Humans; Phylogeny; 030304 developmental biology; Centromeres; 0303 health sciences; Multidisciplinary; Human evolutionary genetics; Chromosome; DNA Methylation; Telomere; Macaca mulatta; Evolutionary biology; Human genome; Female; 030217 neurology & neurosurgery; Chromosomes, Human, Pair 8
- Abstract
The complete assembly of each human chromosome is essential for understanding human biology and evolution1,2. Here we use complementary long-read sequencing technologies to complete the linear assembly of human chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.08-Mb centromeric α-satellite array, a 644-kb copy number polymorphism in the β-defensin gene cluster that is important for disease risk, and an 863-kb variable number tandem repeat at chromosome 8q21.2 that can function as a neocentromere. We show that the centromeric α-satellite array is generally methylated except for a 73-kb hypomethylated region of diverse higher-order α-satellites enriched with CENP-A nucleosomes, consistent with the location of the kinetochore. In addition, we confirm the overall organization and methylation pattern of the centromere in a diploid human genome. Using a dual long-read sequencing approach, we complete high-quality draft assemblies of the orthologous centromere from chromosome 8 in chimpanzee, orangutan and macaque to reconstruct its evolutionary history. Comparative and phylogenetic analyses show that the higher-order α-satellite structure evolved in the great ape ancestor with a layered symmetry, in which more ancient higher-order repeats locate peripherally to monomeric α-satellites. We estimate that the mutation rate of centromeric satellite DNA is accelerated by more than 2.2-fold compared to the unique portions of the genome, and this acceleration extends into the flanking sequence.
The complete assembly of human chromosome 8 resolves previous gaps and reveals hidden complex forms of genetic variation, enabling functional and evolutionary characterization of primate centromeres.
- Published
- 2020
10. Long-read sequence and assembly of segmental duplications
- Author
-
Evan E. Eichler, Richard K. Wilson, Vy Dang, Anne Marie E. Welch, Max L. Dougherty, Tina A. Graves-Lindsay, Philip C. Dishuck, Melanie Sorensen, Mark Chaisson, and Mitchell R. Vollger
- Subjects
Oxford Nanopore Technologies (ONT); Sequence analysis; segmental duplication; long-read; Computational biology; Biology; Biochemistry; Genome; Article; 03 medical and health sciences; Segmental Duplications, Genomic; Humans; Molecular Biology; 030304 developmental biology; Sequence (medicine); Segmental duplication; PacBio sequencing; 0303 health sciences; Genome, Human; gene duplication; Computational Biology; Molecular Sequence Annotation; Sequence Analysis, DNA; Cell Biology; Gene Annotation; real-time (SMRT) sequence; Human genome; single-molecule; Biotechnology; Reference genome
- Abstract
We have developed a computational method based on polyploid phasing of long sequence reads to resolve collapsed regions of segmental duplications within genome assemblies. Segmental Duplication Assembler (SDA; https://github.com/mvollger/SDA ) constructs graphs in which paralogous sequence variants define the nodes and long-read sequences provide attraction and repulsion edges, enabling the partition and assembly of long reads corresponding to distinct paralogs. We apply it to single-molecule, real-time sequence data from three human genomes and recover 33-79 megabase pairs (Mb) of duplications in which approximately half of the loci are diverged (99.8%) compared to the reference genome. We show that the corresponding sequence is highly accurate (99.9%) and that the diverged sequence corresponds to copy-number-variable paralogs that are absent from the human reference genome. Our method can be applied to other complex genomes to resolve the last gene-rich gaps, improve duplicate gene annotation, and better understand copy-number-variant genetic diversity at the base-pair level.
- Published
- 2018
- Full Text
- View/download PDF
11. Evolutionary Dynamics of the POTE Gene Family in Human and Nonhuman Primates
- Author
-
Francesco Maria Calabrese, Mario Ventura, Nicola Lorusso, Claudia Rita Catacchio, Flavia Angela Maria Maggiolini, Giuliana Giannuzzi, Ludovica Mercuri, Alberto L'Abbate, Evan E. Eichler, Francesca Antonacci, Fabio Anaclerio, and Melanie Sorensen
- Subjects
Male; 0301 basic medicine; lcsh:QH426-470; Placenta; primates; Computational biology; Biology; Genome; Article; Evolution, Molecular; Molecular cytogenetics; 03 medical and health sciences; 0302 clinical medicine; Pregnancy; Testis; evolution; Genetics; Animals; Humans; Gene family; Computer Simulation; Tissue Distribution; Gene; Genetics (clinical); Segmental duplication; Gene Expression Profiling; Ovary; Prostate; Chromosome Mapping; centromeres; Single Molecule Imaging; 3. Good health; Long interspersed nuclear element; lcsh:Genetics; 030104 developmental biology; Gene Expression Regulation; Multigene Family; 030220 oncology & carcinogenesis; gene family; Female; Tandem exon duplication; Reference genome
- Abstract
POTE (prostate, ovary, testis, and placenta expressed) genes belong to a primate-specific gene family expressed in prostate, ovary, and testis as well as in several cancers including breast, prostate, and lung cancers. Due to their tumor-specific expression, POTEs are potential oncogenes, therapeutic targets, and biomarkers for these malignancies. This gene family maps within human and primate segmental duplications with a copy number ranging from two to 14 in different species. Due to the high sequence identity among the gene copies, specific efforts are needed to assemble these loci in order to correctly define the organization and evolution of the gene family. Using single-molecule, real-time (SMRT) sequencing, in silico analyses, and molecular cytogenetics, we characterized the structure, copy number, and chromosomal distribution of the POTE genes, as well as their expression in normal and disease tissues, and provided a comparative analysis of the POTE organization and gene structure in primate genomes. We were able, for the first time, to de novo sequence and assemble a POTE tandem duplication in marmoset that is misassembled and collapsed in the reference genome, thus revealing the presence of a second POTE copy. Taken together, our findings provide comprehensive insights into the evolutionary dynamics of the primate-specific POTE gene family, involving gene duplications, deletions, and long interspersed nuclear element (LINE) transpositions to explain the actual repertoire of these genes in human and primate genomes.
- Published
- 2020
- Full Text
- View/download PDF
12. Recurrent inversion toggling and great ape genome evolution
- Author
-
Shwetha C. Murali, Melanie Sorensen, Wolfram Höps, David Porubsky, David Gordon, Alex A. Pollen, Jan O. Korbel, Tobias Marschall, Ashley D. Sanders, Evan E. Eichler, PingHsun Hsieh, Francesca Antonacci, Arvis Sulovari, Stuart Cantsilieris, Ludovica Mercuri, Ruiyang Li, and Mario Ventura
- Subjects
Male; Genome evolution; DNA Copy Number Variations; Evolution; Locus (genetics); Biology; Medical and Health Sciences; Article; Chromosomes; Evolution, Molecular; 03 medical and health sciences; 0302 clinical medicine; Gene duplication; Genetics; Animals; Humans; Copy-number variation; Gene; X chromosome; 030304 developmental biology; 0303 health sciences; Genome; Breakpoint; Haplotype; Human Genome; Molecular; Hominidae; Biological Sciences; Haplotypes; Evolutionary biology; Chromosome Inversion; Female; 030217 neurology & neurosurgery; Biotechnology; Developmental Biology
- Abstract
Inversions play an important role in disease and evolution but are difficult to characterize because their breakpoints map to large repeats. We increased by sixfold the number (n = 1,069) of previously reported great ape inversions by using single-cell DNA template strand and long-read sequencing. We find that the X chromosome is most enriched (2.5-fold) for inversions, on the basis of its size and duplication content. There is an excess of differentially expressed primate genes near the breakpoints of large (>100 kilobases (kb)) inversions but not smaller events. We show that when great ape lineage-specific duplications emerge, they preferentially (approximately 75%) occur in an inverted orientation compared to that at their ancestral locus. We construct megabase-pair scale haplotypes for individual chromosomes and identify 23 genomic regions that have recurrently toggled between a direct and an inverted state over 15 million years. The direct orientation is most frequently the derived state for human polymorphisms that predispose to recurrent copy number variants associated with neurodevelopmental disease.
- Published
- 2020
- Full Text
- View/download PDF
13. A fully phased accurate assembly of an individual human genome
- Author
-
Katherine M. Munson, Mark Chaisson, Jan O. Korbel, Tobias Marschall, Scott E. Devine, Benedict Paten, Ashley D. Sanders, Charles Lee, Evan E. Eichler, William T. Harvey, Marina Haukness, Maryam Ghareghani, Mitchell R. Vollger, Melanie Sorensen, P. Ebert, David Porubsky, Peter A. Audano, Peter M. Lansdorp, and Arvis Sulovari
- Subjects
0303 health sciences; Contig; Haplotype; Sequence assembly; Computational biology; Biology; Genome; 03 medical and health sciences; 0302 clinical medicine; Consensus sequence; Human genome; Nanopore sequencing; Indel; 030217 neurology & neurosurgery; 030304 developmental biology
- Abstract
The prevailing genome assembly paradigm is to produce consensus sequences that “collapse” parental haplotypes into a consensus sequence. Here, we leverage the chromosome-wide phasing and scaffolding capabilities of single-cell strand sequencing (Strand-seq)1,2 and combine them with high-fidelity (HiFi) long sequencing reads3, in a novel reference-free workflow for diploid de novo genome assembly. Employing this strategy, we produce completely phased de novo genome assemblies separately for each haplotype of a single individual of Puerto Rican origin (HG00733) in the absence of parental data. The assemblies are accurate (QV > 40), highly contiguous (contig N50 > 25 Mbp) with low switch error rates (0.4%) providing fully phased single-nucleotide variants (SNVs), indels, and structural variants (SVs). A comparison of Oxford Nanopore and PacBio phased assemblies identifies 150 regions that are preferential sites of contig breaks irrespective of sequencing technology or phasing algorithms.
- Published
- 2019
- Full Text
- View/download PDF
14. Adaptive archaic introgression of copy number variants and the discovery of previously unknown human genes
- Author
-
Katherine M. Munson, Evan E. Eichler, PingHsun Hsieh, Flavia Angela Maria Maggiolini, Giorgia Chiatante, Alexandra P. Lewis, Francesca Antonacci, Vy Dang, Carl Baker, Stuart Cantsilieris, David Porubsky, Bradley J. Nelson, Shwetha C. Murali, Melanie Sorensen, Jean-François Deleuze, Kendra Hoekzema, Jason G. Underwood, Mitchell R. Vollger, Hélène Blanché, and Zev N. Kronenberg
- Subjects
DNA Copy Number Variations; Introgression; Biology; Genetic Introgression; Genome; Article; Evolution, Molecular; mental disorders; Chromosome Duplication; Genetic variation; Animals; Humans; Copy-number variation; Selection, Genetic; Neanderthals; Whole genome sequencing; Polymorphism, Genetic; Multidisciplinary; Models, Genetic; Whole Genome Sequencing; Genome, Human; Haplotype; Hominidae; Haplotypes; Evolutionary biology; Human genome; Melanesia; Adaptation; Chromosomes, Human, Pair 16; Chromosomes, Human, Pair 8
- Abstract
Adaptive archaic hominin genes: As they migrated out of Africa and into Europe and Asia, anatomically modern humans interbred with archaic hominins, such as Neanderthals and Denisovans. The result of this genetic introgression on the recipient populations has been of considerable interest, especially in cases of selection for specific archaic genetic variants. Hsieh et al. characterized adaptive structural variants and copy number variants that are likely targets of positive selection in Melanesians. Focusing on population-specific regions of the genome that carry duplicated genes and show an excess of amino acid replacements provides evidence for one of the mechanisms by which genetic novelty can arise and result in differentiation between human genomes. Science, this issue p. eaax2083
- Published
- 2019
- Full Text
- View/download PDF
15. Efficient de novo assembly of eleven human genomes using PromethION sequencing and a novel nanopore toolkit
- Author
-
Nicholas Maurer, Karen H. Miga, Hugh E. Olsen, Tobias Marschall, Kelvin J. Liu, Colleen M. Bosworth, Kristof Tigyi, Simon Mayes, Joel Armstrong, Mark Akeson, Benedict Paten, Evan E. Eichler, Trevor Pesout, Marina Haukness, Katherine M. Munson, Fritz J. Sedlazeck, Sofie R. Salama, Costa, Richard E. Green, Duncan Kilburn, Melanie Sorensen, Paolo Carnevali, Miten Jain, Adam M. Phillippy, Ryan Lorig-Roach, Sergey Koren, Mitchell R. Vollger, Kishwar Shafin, David Haussler, and Justin M. Zook
- Subjects
0303 health sciences; Computer science; Sequence assembly; Computational biology; Genome; 03 medical and health sciences; Nanopore; 0302 clinical medicine; Human genome; Nanopore sequencing; Ploidy; Ligation; 030217 neurology & neurosurgery; 030304 developmental biology
- Abstract
Present workflows for producing human genome assemblies from long-read technologies have cost and production time bottlenecks that prohibit efficient scaling to large cohorts. We demonstrate an optimized PromethION nanopore sequencing method for eleven human genomes. The sequencing, performed on one machine in nine days, achieved an average 63x coverage, 42 Kb read N50, 90% median read identity and 6.5x coverage in 100 Kb+ reads using just three flow cells per sample. To assemble these data we introduce new computational tools: Shasta, a de novo long-read assembler, and MarginPolish & HELEN, a suite of nanopore assembly polishing algorithms. On a single commercial compute node Shasta can produce a complete human genome assembly in under six hours, and MarginPolish & HELEN can polish the result in just over a day, achieving 99.9% identity (QV30) for haploid samples from nanopore reads alone. We evaluate assembly performance for diploid, haploid and trio-binned human samples in terms of accuracy, cost, and time and demonstrate improvements relative to current state-of-the-art methods in all areas. We further show that addition of proximity ligation (Hi-C) sequencing yields near chromosome-level scaffolds for all eleven genomes.
- Published
- 2019
16. Recurrent structural variation, clustered sites of selection, and disease risk for the complement factor H (CFH) gene family
- Author
-
Melanie Sorensen, Andrea J. Richardson, Robyn H. Guymer, John Huddleston, Kelsi Penewit, Katherine M. Munson, Rando Allikmets, Felix Grassmann, Vy Dang, Tina A. Graves-Lindsay, Bernhard H. F. Weber, Bradley J. Nelson, Anne Marie E. Welch, Carl Baker, Stuart Cantsilieris, Paul N. Baird, Evan E. Eichler, Richard K. Wilson, and Lana Harshman
- Subjects
Primates ,0301 basic medicine ,Nonsynonymous substitution ,Genotype ,Locus (genetics) ,Biology ,Polymorphism, Single Nucleotide ,Evolution, Molecular ,Macular Degeneration ,03 medical and health sciences ,0302 clinical medicine ,Risk Factors ,Gene duplication ,Animals ,Humans ,Missense mutation ,Gene family ,Genetic Predisposition to Disease ,Selection, Genetic ,Gene ,Genetics ,Multidisciplinary ,Haplotype ,Exons ,eye diseases ,Phenotype ,030104 developmental biology ,PNAS Plus ,Haplotypes ,Complement Factor H ,Multigene Family ,Factor H ,Mutation ,030217 neurology & neurosurgery - Abstract
Structural variation and single-nucleotide variation of the complement factor H (CFH) gene family underlie several complex genetic diseases, including age-related macular degeneration (AMD) and atypical hemolytic uremic syndrome (AHUS). To understand its diversity and evolution, we performed high-quality sequencing of this ∼360-kbp locus in six primate lineages, including multiple human haplotypes. Comparative sequence analyses reveal two distinct periods of gene duplication leading to the emergence of four CFH-related (CFHR) gene paralogs (CFHR2 and CFHR4 ∼25-35 Mya and CFHR1 and CFHR3 ∼7-13 Mya). Remarkably, all evolutionary breakpoints share a common ∼4.8-kbp segment corresponding to an ancestral CFHR gene promoter that has expanded independently throughout primate evolution. This segment is recurrently reused and juxtaposed with a donor duplication containing exons 8 and 9 from ancestral CFH, creating four CFHR fusion genes that include lineage-specific members of the gene family. Combined analysis of >5,000 AMD cases and controls identifies a significant burden of a rare missense mutation that clusters at the N terminus of CFH [P = 5.81 × 10⁻⁸, odds ratio (OR) = 9.8 (3.67-Infinity)]. A bipolar clustering pattern of rare nonsynonymous mutations in patients with AMD (P 2,400 individuals reveals five recurrent rearrangement breakpoints that show variable frequency among AMD cases and controls. These data suggest a dynamic and recurrent pattern of mutation critical to the emergence of new CFHR genes but also in the predisposition to complex human genetic disease phenotypes.
- Published
- 2018
17. High-resolution comparative analysis of great ape genomes
- Author
-
Mark Chaisson, Jay Shendure, Melanie Sorensen, Christopher M. Hill, Zev N. Kronenberg, Fred H. Gage, Andy Wing Chun Pang, Shwetha C. Murali, Olivia S. Meyerson, Jason G. Underwood, Carl Baker, Kendra Hoekzema, Ruolan Qiu, Bradley J. Nelson, Katherine M. Munson, Susan K. Dutcher, Ahmet M. Denli, Wesley C. Warren, Stuart Cantsilieris, Archana Raja, Ernest T. Lam, Alex Hastie, Richard K. Wilson, Karen Clark, Benedict Paten, David Haussler, Tina A. Graves-Lindsay, Joyce V. Lee, Emma R. Hoppe, Alex A. Pollen, Mark Diekhans, Valerie A. Schneider, Nicola Lorusso, Robert S. Fulton, Fereydoun Hormozdiari, David Gordon, Mario Ventura, Evan E. Eichler, Anne Marie E. Welch, Joel Armstrong, Max L. Dougherty, PingHsun Hsieh, Han Cao, and Ian T. Fiddes
- Subjects
0301 basic medicine ,Lineage (genetic) ,Evolution ,General Science & Technology ,1.1 Normal biological development and functioning ,Retrotransposon ,Biology ,Genome ,Article ,Structural variation ,Evolution, Molecular ,03 medical and health sciences ,Contig Mapping ,0302 clinical medicine ,Underpinning research ,Genetics ,2.1 Biological and endogenous factors ,Animals ,Humans ,Aetiology ,Gene ,Synteny ,Bacterial artificial chromosome ,Multidisciplinary ,Genome, Human ,Human Genome ,Molecular ,Genetic Variation ,Hominidae ,Molecular Sequence Annotation ,DNA ,Sequence Analysis, DNA ,Stem Cell Research ,030104 developmental biology ,Evolutionary biology ,Generic health relevance ,Sequence Analysis ,030217 neurology & neurosurgery ,Reference genome ,Human ,Biotechnology - Abstract
INTRODUCTION Understanding the genetic differences that make us human is a long-standing endeavor that requires the comprehensive discovery and comparison of all forms of genetic variation within great ape lineages. RATIONALE The varied quality and completeness of ape genomes have limited comparative genetic analyses. To eliminate this contiguity and quality disparity, we generated human and nonhuman ape genome assemblies without the guidance of the human reference genome. These new genome assemblies enable both coarse and fine-scale comparative genomic studies. RESULTS We sequenced and assembled two human, one chimpanzee, and one orangutan genome using high-coverage (>65x) single-molecule, real-time (SMRT) long-read sequencing technology. We also sequenced more than 500,000 full-length complementary DNA samples from induced pluripotent stem cells to construct de novo gene models, increasing our knowledge of transcript diversity in each ape lineage. The new nonhuman ape genome assemblies improve gene annotation and genomic contiguity (by 30- to 500-fold), resulting in the identification of larger synteny blocks (by 22- to 74-fold) when compared to earlier assemblies. Including the latest gorilla genome, we now estimate that 83% of the ape genomes can be compared in a multiple sequence alignment. We observe a modest increase in single-nucleotide variant divergence compared to previous genome analyses and estimate that 36% of human autosomal DNA is subject to incomplete lineage sorting. We fully resolve most common repeat differences, including full-length retrotransposons such as the African ape-specific endogenous retroviral element PtERV1. We show that the spread of this element independently in the gorilla and chimpanzee lineage likely resulted from a founder element that failed to segregate to the human lineage because of incomplete lineage sorting. 
The improved sequence contiguity allowed a more systematic discovery of structural variation (>50 base pairs in length) (see the figure). We detected 614,186 ape deletions, insertions, and inversions, assigning each to specific ape lineages. Unbiased genome scaffolding (optical maps, bacterial artificial chromosome sequencing, and fluorescence in situ hybridization) led to the discovery of large, unknown complex inversions in gene-rich regions. Of the 17,789 fixed human-specific insertions and deletions, we focus on those of potential functional effect. We identify 90 that are predicted to disrupt genes and an additional 643 that likely affect regulatory regions, more than doubling the number of human-specific deletions that remove regulatory sequence in the human lineage. We investigate the association of structural variation with changes in human-chimpanzee brain gene expression using cerebral organoids as a proxy for expression differences. Genes associated with fixed structural variants (SVs) show a pattern of down-regulation in human radial glial neural progenitors, whereas human-specific duplications are associated with up-regulated genes in human radial glial and excitatory neurons (see the figure). CONCLUSION The improved ape genome assemblies provide the most comprehensive view to date of intermediate-size structural variation and highlight several dozen genes associated with structural variation and brain-expression differences between humans and chimpanzees. These new references will provide a stepping stone for the completion of great ape genomes at a quality commensurate with the human reference genome and, ultimately, an understanding of the genetic differences that make us human.
- Published
- 2017
18. Characterizing the Major Structural Variant Alleles of the Human Genome
- Author
-
Bradley J. Nelson, Susan K. Dutcher, AnneMarie E. Welch, Sean McGrath, Max L. Dougherty, Arvis Sulovari, Stuart Cantsilieris, Vincent Magrini, Yang I. Li, Peter A. Audano, Evan E. Eichler, Tina A. Graves-Lindsay, Ankeeta Shah, Wesley C. Warren, Richard K. Wilson, and Melanie Sorensen
- Subjects
Euchromatin ,Minisatellite Repeats ,Computational biology ,Biology ,Genome ,Article ,General Biochemistry, Genetics and Molecular Biology ,Structural variation ,03 medical and health sciences ,0302 clinical medicine ,Gene Frequency ,Humans ,Allele ,Genotyping ,Alleles ,030304 developmental biology ,Sequence (medicine) ,0303 health sciences ,Genome, Human ,Genomics ,Sequence Analysis, DNA ,Variable number tandem repeat ,Genomic Structural Variation ,Human genome ,030217 neurology & neurosurgery - Abstract
In order to provide a comprehensive resource for human structural variants (SVs), we generated long-read sequence data and analyzed SVs for fifteen human genomes. We sequence resolved 99,604 insertions, deletions, and inversions including 2,238 (1.6 Mbp) that are shared among all discovery genomes with an additional 13,053 (6.9 Mbp) present in the majority, indicating minor alleles or errors in the reference. Genotyping in 440 additional genomes confirms the most common SVs in unique euchromatin are now sequence resolved. We report a ninefold SV bias toward the last 5 Mbp of human chromosomes with nearly 55% of all VNTRs (variable number of tandem repeats) mapping to this portion of the genome. We identify SVs affecting coding and noncoding regulatory loci improving annotation and interpretation of functional variation. These data provide the framework to construct a canonical human reference and a resource for developing advanced representations capable of capturing allelic diversity.
- Published
- 2019