Dadasnake, a Snakemake implementation of DADA2 to process amplicon sequencing data for microbial ecology
- Source :
- GigaScience
- Publication Year :
- 2020
Abstract
- Background: Amplicon sequencing of phylogenetic marker genes, e.g., 16S, 18S, or ITS ribosomal RNA sequences, is still the most commonly used method to determine the composition of microbial communities. Microbial ecologists often have expert knowledge on their biological question and data analysis in general, and most research institutes have computational infrastructures to use the bioinformatics command line tools and workflows for amplicon sequencing analysis, but requirements of bioinformatics skills often limit the efficient and up-to-date use of computational resources.
- Results: We present dadasnake, a user-friendly, 1-command Snakemake pipeline that wraps the preprocessing of sequencing reads and the delineation of exact sequence variants by using the favorably benchmarked and widely used DADA2 algorithm with a taxonomic classification and the post-processing of the resultant tables, including hand-off in standard formats. The suitability of the provided default configurations is demonstrated using mock community data from bacteria and archaea, as well as fungi.
- Conclusions: By use of Snakemake, dadasnake makes efficient use of high-performance computing infrastructures. Easy user configuration guarantees flexibility of all steps, including the processing of data from multiple sequencing platforms. It is easy to install dadasnake via conda environments. dadasnake is available at https://github.com/a-h-b/dadasnake.
- Subjects :
- 0106 biological sciences
Computer science
Process (engineering)
AcademicSubjects/SCI02254
microbiome
Health Informatics
computer.software_genre
010603 evolutionary biology
01 natural sciences
03 medical and health sciences
Microbial ecology
RNA, Ribosomal, 16S
Technical Note
denoising
Preprocessor
rRNA gene sequence analysis
Gene
Phylogeny
030304 developmental biology
Flexibility (engineering)
0303 health sciences
biology
Microbiota
High-Throughput Nucleotide Sequencing
pipeline
Biological classification
Ribosomal RNA
biology.organism_classification
Pipeline (software)
Computer Science Applications
Workflow
Amplicon sequencing
AcademicSubjects/SCI00960
Data mining
exact sequence variants
community structure
computer
Bacteria
Software
Archaea
Details
- ISSN :
- 2047-217X
- Volume :
- 9
- Issue :
- 12
- Database :
- OpenAIRE
- Journal :
- GigaScience
- Accession number :
- edsair.doi.dedup.....10e0430bf0d3b002a98389dd0430681d