SeqSero2: Rapid and Improved Salmonella Serotype Determination Using Whole-Genome Sequencing Data.

Authors :
Shaokang Zhang
den Bakker, Hendrik C.
Shaoting Li
Chen, Jessica
Dinsmore, Blake A.
Lane, Charlotte
Lauer, A. C.
Fields, Patricia I.
Xiangyu Deng
Source :
Applied & Environmental Microbiology. Dec2019, Vol. 85 Issue 23, p1-13. 13p.
Publication Year :
2019

Abstract

SeqSero, launched in 2015, is a software tool for Salmonella serotype determination from whole-genome sequencing (WGS) data. Despite its routine use in public health and food safety laboratories in the United States and other countries, the original SeqSero pipeline is relatively slow (minutes per genome using sequencing reads), is not optimized for draft genome assemblies, and may assign multiple serotypes for a strain. Here, we present SeqSero2 (github.com/denglab/SeqSero2; denglab.info/SeqSero2), an algorithmic transformation and functional update of the original SeqSero. Major improvements include (i) additional sequence markers for identification of Salmonella species and subspecies and certain serotypes, (ii) a k-mer based algorithm for rapid serotype prediction from raw reads (seconds per genome) and improved serotype prediction from assemblies, and (iii) a targeted assembly approach for specific retrieval of serotype determinants from WGS for serotype prediction, new allele discovery, and prediction troubleshooting. Evaluated using 5,794 genomes representing 364 common U.S. serotypes, including 2,280 human isolates of 117 serotypes from the National Antimicrobial Resistance Monitoring System, SeqSero2 is up to 50 times faster than the original SeqSero while maintaining equivalent accuracy for raw reads and substantially improving accuracy for assemblies. SeqSero2 further suggested that 3% of the tested genomes contained reads from multiple serotypes, indicating a use for contamination detection. In addition to short reads, SeqSero2 demonstrated potential for accurate and rapid serotype prediction directly from long nanopore reads despite base call errors. Testing of 40 nanopore-sequenced genomes of 17 serotypes yielded a single H antigen misidentification.

IMPORTANCE Serotyping is the basis of public health surveillance of Salmonella. It remains a first-line subtyping method even as surveillance continues to be transformed by whole-genome sequencing. SeqSero allows the integration of Salmonella serotyping into a whole-genome-sequencing-based laboratory workflow while maintaining continuity with the classic serotyping scheme. SeqSero2, informed by extensive testing and application of SeqSero in the United States and other countries, incorporates important improvements and updates that further strengthen its application in routine and large-scale surveillance of Salmonella by whole-genome sequencing. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
00992240
Volume :
85
Issue :
23
Database :
Academic Search Index
Journal :
Applied & Environmental Microbiology
Publication Type :
Academic Journal
Accession number :
140056370
Full Text :
https://doi.org/10.1128/AEM.01746-19