CSSSCL: a python package that uses combined sequence similarity scores for accurate taxonomic classification of long and short sequence reads
- Source :
- Bioinformatics
- Publication Year :
- 2015
- Publisher :
- Oxford University Press, 2015.
Abstract
- Summary: Sequence comparison of genetic material between known and unknown organisms plays a crucial role in genomics, metagenomics and phylogenetic analysis. Emerging long-read sequencing technologies can now produce reads tens of kilobases in length, which promise a more accurate assessment of their origin. To facilitate the classification of long and short DNA sequences, we have developed a Python package implementing a new sequence classification model that we have shown to improve classification accuracy compared with other state-of-the-art classification methods. To validate the combined sequence similarity score classifier (CSSSCL) and demonstrate its usefulness, we test it on three different datasets, including a metagenomic dataset composed of short reads. Availability and implementation: The package’s source code and test datasets are available under the GPLv3 license at https://github.com/oicr-ibc/cssscl. Contact: ivan.borozan@oicr.on.ca. Supplementary information: Supplementary data are available at Bioinformatics online.
- Subjects :
- 0301 basic medicine
Statistics and Probability
Source code
media_common.quotation_subject
Genomics
Sequence alignment
Biology
computer.software_genre
Biochemistry
DNA sequencing
03 medical and health sciences
Software
Molecular Biology
Alignment-free sequence analysis
Phylogeny
media_common
Bacteria
business.industry
Sequence Analysis, DNA
Models, Theoretical
Applications Notes
Computer Science Applications
Computational Mathematics
030104 developmental biology
Computational Theory and Mathematics
Metagenomics
Viruses
Data mining
business
Classifier (UML)
computer
Sequence Analysis
Sequence Alignment
Algorithms
Details
- Language :
- English
- ISSN :
- 1367-4811 and 1367-4803
- Volume :
- 32
- Issue :
- 3
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....222f60b7c21a03208b8730557d5cf87a