1. Remarkably High Repeat Content in the Genomes of Sparrows: The Importance of Genome Assembly Completeness for Transposable Element Discovery
- Author
Benham, Phred M, Cicero, Carla, Escalona, Merly, Beraut, Eric, Fairbairn, Colin, Marimuthu, Mohan PA, Nguyen, Oanh, Sahasrabudhe, Ruta, King, Benjamin L, Thomas, W Kelley, Kovach, Adrienne I, Nachman, Michael W, and Bowie, Rauri CK
- Subjects
Biological Sciences, Bioinformatics and Computational Biology, Genetics, Human Genome, Animals, DNA Transposable Elements, Sparrows, Sequence Analysis, DNA, Passerellidae, transposable elements, genome size, California Conservation Genomics Project, C-value, Biochemistry and Cell Biology, Evolutionary Biology, Developmental Biology, Biochemistry and cell biology, Evolutionary biology
- Abstract
Transposable elements (TEs) play critical roles in shaping genome evolution. Highly repetitive TE sequences are also a major source of assembly gaps, making it difficult to fully understand the impact of these elements on host genomes. The increased capacity of long-read sequencing technologies to span highly repetitive regions promises to provide new insights into patterns of TE activity across diverse taxa. Here we report the generation of highly contiguous reference genomes using PacBio long-read and Omni-C technologies for three species of Passerellidae sparrow. We compared these assemblies to three chromosome-level sparrow assemblies and nine other sparrow assemblies generated using a variety of short- and long-read technologies. All long-read-based assemblies were longer (range: 1.12 to 1.41 Gb) than short-read assemblies (0.91 to 1.08 Gb), and assembly length was strongly correlated with the amount of repeat content. Repeat content for Bell's sparrow (31.2% of genome) was the highest level ever reported within the order Passeriformes, which comprises over half of avian diversity. Repeat content was highest on the W chromosome (79.2% to 93.7%) relative to other regions of the genome. Finally, we show that proliferation of different TE classes varied even among species with similar levels of repeat content. These patterns support a dynamic model of TE expansion and contraction even in a clade where TEs were once thought to be fairly depauperate and static. Our work highlights how the resolution of difficult-to-assemble regions of the genome with new sequencing technologies promises to transform our understanding of avian genome evolution.
- Published
2024