chewBBACA: A complete suite for gene-by-gene schema creation and strain identification
- Source :
- Microbial Genomics
- Publication Year :
- 2018
- Publisher :
- Microbiology Society, 2018.
Abstract
- Gene-by-gene approaches are becoming increasingly popular in bacterial genomic epidemiology and outbreak detection. However, there is a lack of open-source scalable software for schema definition and allele calling for these methodologies. The chewBBACA suite was designed to assist users in the creation and evaluation of novel whole-genome or core-genome gene-by-gene typing schemas and subsequent allele calling in bacterial strains of interest. The software can run on a laptop or on high-performance clusters, making it useful for both small laboratories and large reference centers. ChewBBACA is available at https://github.com/B-UMMI/chewBBACA or as a docker image at https://hub.docker.com/r/ummidock/chewbbaca/.
DATA SUMMARY
Assembled genomes used for the tutorial were downloaded from NCBI in August 2016 by selecting those submitted as Streptococcus agalactiae taxon or sub-taxa. All the assemblies have been deposited as a zip file in FigShare (https://figshare.com/s/9cbe1d422805db54cd52), where a file with the original ftp link for each NCBI directory is also available. Code for the chewBBACA suite is available at https://github.com/B-UMMI/chewBBACA while the tutorial example is found at https://github.com/B-UMMI/chewBBACA_tutorial. I/We confirm all supporting data, code and protocols have been provided within the article or through supplementary data files. ⊠
IMPACT STATEMENT
The chewBBACA software offers a computational solution for the creation, evaluation and use of whole genome (wg) and core genome (cg) multilocus sequence typing (MLST) schemas. It allows researchers to develop wg/cgMLST schemes for any bacterial species from a set of genomes of interest. The alleles identified by chewBBACA correspond to potential coding sequences, possibly offering insights into the correspondence between the genetic variability identified and phenotypic variability. The software performs allele calling in a matter of seconds to minutes per strain on a laptop but is easily scalable for the analysis of large datasets of hundreds of thousands of strains using multiprocessing options. The chewBBACA software thus provides an efficient and freely available open source solution for gene-by-gene methods. Moreover, the ability to perform these tasks locally is desirable when the submission of raw data to a central repository or web services is hindered by data protection policies or ethical or legal concerns.
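As an illustration of the gene-by-gene idea the abstract describes, the following is a minimal sketch of how allelic profiles can be reduced to a core set of loci and compared between strains. It does not use chewBBACA's code or command-line interface: the function names, the toy profiles and the presence threshold are all assumptions made for this example.

```python
from typing import Dict, List, Optional

# Hypothetical toy data model: for each strain, the allele identifier called
# at each locus. None marks a locus for which no allele was called.
Profile = Dict[str, Optional[int]]


def core_loci(profiles: Dict[str, Profile], min_presence: float = 1.0) -> List[str]:
    """Return loci with an allele call in at least `min_presence` of the strains."""
    strains = list(profiles.values())
    loci = sorted({locus for profile in strains for locus in profile})
    kept = []
    for locus in loci:
        present = sum(1 for profile in strains if profile.get(locus) is not None)
        if present / len(strains) >= min_presence:
            kept.append(locus)
    return kept


def allelic_distance(a: Profile, b: Profile, loci: List[str]) -> int:
    """Count loci where both strains have a call and the allele identifiers differ."""
    return sum(
        1
        for locus in loci
        if a.get(locus) is not None
        and b.get(locus) is not None
        and a[locus] != b[locus]
    )


# Illustrative profiles only; these are not results from the paper.
profiles = {
    "strain_A": {"locus1": 1, "locus2": 3, "locus3": 2},
    "strain_B": {"locus1": 1, "locus2": 4, "locus3": None},
    "strain_C": {"locus1": 2, "locus2": 3, "locus3": 2},
}

core = core_loci(profiles)  # loci called in every strain
for first, second in [("strain_A", "strain_B"), ("strain_B", "strain_C")]:
    distance = allelic_distance(profiles[first], profiles[second], core)
    print(first, second, distance)
```

Lowering `min_presence` (for example to 0.95) relaxes the core definition so that loci missing from a few strains are still kept. The threshold used in practice is a choice made per schema, not something this sketch prescribes.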
- Subjects :
- 0301 basic medicine
business.product_category
Microbial Evolution and Epidemiology: Population Genomics
Computer science
030106 microbiology
Methods Paper
multilocus sequence typing
Genome
Polymorphism, Single Nucleotide
World Wide Web
03 medical and health sciences
Software
schema
Schema (psychology)
Allele
Gene
1183 Plant biology, microbiology, virology
Alleles
computer.programming_language
030304 developmental biology
0303 health sciences
business.industry
030306 microbiology
Suite
Strain (biology)
allele calling
General Medicine
Python (programming language)
gene-by-gene
3142 Public health care science, environmental and occupational health
Identification (information)
030104 developmental biology
ComputingMethodologies_PATTERNRECOGNITION
chewBBACA
Genetic Loci
Laptop
Scalability
Software engineering
business
computer
Algorithms
Genome, Bacterial
Details
- Language :
- English
- ISSN :
- 2057-5858
- Volume :
- 4
- Issue :
- 3
- Database :
- OpenAIRE
- Journal :
- Microbial Genomics
- Accession number :
- edsair.doi.dedup.....90b962e39df15bda62560a6583634a53