Long-read assembly and comparative evidence-based reanalysis of Cryptosporidium genome sequences reveal new biological insights

Authors :
Alan Tracey
Brendan R E Ansell
Matthew Berriman
Yiran Li
Rui Xiao
Rodrigo P. Baptista
Ethan D. Smith
Boris Striepen
Mandy Sanders
Jessica C. Kissinger
James Cotton
Karen Brooks
Aaron R. Jex
Garrett W. Cooper
Jennifer E. Dumaine
Adam Sateriale
Publication Year :
2021
Publisher :
Cold Spring Harbor Laboratory, 2021.

Abstract

Cryptosporidiosis is a leading cause of waterborne diarrheal disease globally and an important contributor to mortality in infants and the immunosuppressed. Despite its importance, the Cryptosporidium community still relies on a fragmented reference genome sequence from 2004. Incomplete reference sequences hamper experimental design and interpretation. We have generated a new C. parvum IOWA genome assembly supported by PacBio and Oxford Nanopore long-read technologies and a new comparative and consistent genome annotation for three closely related species: C. parvum, C. hominis and C. tyzzeri. The new C. parvum IOWA reference genome assembly is larger, gap free and lacks ambiguous bases. This chromosomal assembly recovers 13 of 16 possible telomeres and raises a new hypothesis for the remaining telomeres and associated subtelomeric regions. Comparative annotation revealed that most “missing” orthologs are found, suggesting that species differences result primarily from structural rearrangements, gene copy number variation and SNVs in C. parvum, C. hominis and C. tyzzeri. We made >1,500 C. parvum annotation updates based on experimental evidence. They included new transporters, ncRNAs, introns and altered gene structures. The new assembly and annotation revealed a complete DNA methylase Dnmt2 ortholog. Using the new assembly and annotation as reference, 190 genes under positive selection, including many new candidates, were identified. Finally, possible subtelomeric amplification and variation events in C. parvum are detected that reveal a new level of genome plasticity that will both inform and impact future research.

Details

Database :
OpenAIRE
Accession number :
edsair.doi...........507020f0099a60200c9c2ac10aa2c7e8