De novo genome assembly of Solanum sitiens reveals structural variation associated with drought and salinity tolerance
- Source :
- Bioinformatics
- Publication Year :
- 2021
- Publisher :
- Oxford University Press, 2021.
Abstract
- Motivation: Solanum sitiens is a self-incompatible wild relative of tomato, characterized by salt- and drought-resistance traits, with the potential to contribute through breeding programmes to crop improvement in cultivated tomato. This species has a distinct morphology, classification and ecotype compared to other stress-resistant wild tomato relatives such as S.pennellii and S.chilense. Therefore, the availability of a reference genome for S.sitiens will facilitate the genetic and molecular understanding of salt and drought resistance.
- Results: A high-quality de novo genome and transcriptome assembly for S.sitiens (Accession LA1974) has been developed. A hybrid assembly strategy was followed using Illumina short reads (∼159× coverage) and PacBio long reads (∼44× coverage), generating a total of ∼262 Gbp of DNA sequence. A reference genome of 1245 Mbp, arranged in 1483 scaffolds with an N50 of 1.826 Mbp, was generated. Genome completeness was estimated at 95% using the Benchmarking Universal Single-Copy Orthologs (BUSCO) and the K-mer Analysis Tool (KAT). In addition, ∼63 Gbp of RNA-Seq data were generated to support the prediction of 31 164 genes from the assembly, and to perform a de novo transcriptome assembly. Lastly, we identified three large inversions compared to S.lycopersicum, containing several drought-resistance-related genes, such as beta-amylase 1 and YUCCA7.
- Availability and implementation: S.sitiens (LA1974) raw sequencing data, transcriptome and genome assembly have been deposited at the NCBI’s Sequence Read Archive, under the BioProject number ‘PRJNA633104’. All the commands and scripts necessary to generate the assembly are available at the following GitHub repository: https://github.com/MCorentin/Solanum_sitiens_assembly.
- Supplementary information: Supplementary data are available at Bioinformatics online.
- Subjects :
- 0106 biological sciences
Statistics and Probability
AcademicSubjects/SCI01060
drought resistance
Sequence assembly
Solanum sitiens
Biology
01 natural sciences
Biochemistry
Genome
DNA sequencing
Structural variation
Transcriptome
03 medical and health sciences
Wild tomato
Molecular Biology
Gene
030304 developmental biology
Genetics
0303 health sciences
food and beverages
biology.organism_classification
Genome Analysis
Original Papers
S.lycopersicum
Computer Science Applications
Computational Mathematics
Computational Theory and Mathematics
genome assembly
010606 plant biology & botany
Reference genome
Details
- Language :
- English
- ISSN :
- 1367-4811 and 1367-4803
- Volume :
- 37
- Issue :
- 14
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....0550707f1a8668e3a93fa5748c6e7357