Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome through long-read (>11 kb), single molecule, real-time sequencing
- Source :
- DNA Research, DNA Research: An International Journal for Rapid Publication of Reports on Genes and Genomes
Abstract
- The application of next-generation sequencing to estimate genetic diversity of Plasmodium falciparum, the most lethal malaria parasite, has proved challenging due to the skewed AT-richness [∼80.6% (A + T)] of its genome and the lack of technology to assemble highly polymorphic subtelomeric regions that contain clonally variant, multigene virulence families (e.g. var and rifin). To address this, we performed amplification-free, single molecule, real-time sequencing of P. falciparum genomic DNA and generated reads of average length 12 kb, with 50% of the reads between 15.5 and 50 kb in length. Next, using the Hierarchical Genome Assembly Process, we assembled the P. falciparum genome de novo and successfully compiled all 14 nuclear chromosomes telomere-to-telomere. We also accurately resolved centromeres [∼90–99% (A + T)] and subtelomeric regions and identified large insertions and duplications that add extra var and rifin genes to the genome, along with smaller structural variants such as homopolymer tract expansions. Overall, we show that amplification-free, long-read sequencing combined with de novo assembly overcomes major challenges inherent to studying the P. falciparum genome. Indeed, this technology may not only identify the polymorphic and repetitive subtelomeric sequences of parasite populations from endemic areas but may also evaluate structural variation linked to virulence, drug resistance and disease transmission.
- Subjects :
- 0106 biological sciences
0301 basic medicine
Cancer genome sequencing
Plasmodium falciparum
Sequence assembly
de novo assembly
Biology
01 natural sciences
Genome
DNA sequencing
Structural variation
Contig Mapping
03 medical and health sciences
AT-biased
Genetics
Molecular Biology
Exome sequencing
Whole genome sequencing
Polymorphism, Genetic
structural variation
Sequence Analysis, DNA
General Medicine
Telomere
Full Papers
3. Good health
030104 developmental biology
long-read sequencing
Genome, Protozoan
010606 plant biology & botany
Reference genome
Details
- Language :
- English
- ISSN :
- 1756-1663 and 1340-2838
- Volume :
- 23
- Issue :
- 4
- Database :
- OpenAIRE
- Journal :
- DNA Research
- Accession number :
- edsair.doi.dedup.....7552908c7b6474631606fc7adc840156
- Full Text :
- https://doi.org/10.1093/dnares/dsw022