
CleanSeq: A Pipeline for Contamination Detection, Cleanup, and Mutation Verifications from Microbial Genome Sequencing Data

Authors :
Caiyan Wang
Yang Xia
Yunfei Liu
Chen Kang
Nan Lu
Di Tian
Hui Lu
Fuhai Han
Jian Xu
Tetsuya Yomo
Source :
Applied Sciences, Vol 12, Iss 12, p 6209 (2022)
Publication Year :
2022
Publisher :
MDPI AG, 2022.

Abstract

Contamination frequently occurs in bacterial cultures and significantly affects the reproducibility and reliability of results from whole-genome sequencing (WGS). Decontaminated WGS data with clean reads are the only desirable source for detecting possible variants correctly. Improvements in bioinformatics are essential for analyzing contaminated WGS datasets. Existing pipelines usually handle contamination detection, decontamination, and variant calling separately. Their efficiency and results fluctuate because different computational models and parameters are applied. It is therefore promising to develop a bioinformatics tool that can discriminate and remove contaminating reads and improve variant calling from clean reads. In this study, we established a Python-based pipeline named CleanSeq that automatically detects and removes contaminating reads and analyzes possible genome variants, with verification via local re-alignment. Its applicability and reproducibility were demonstrated on simulated datasets, publicly available datasets, and actual genome sequencing reads from our experimental evolution study in Escherichia coli. We successfully obtained decontaminated reads and called all seven consistent mutations from the contaminated bacterial sample and its five derived colonies. Collectively, the results demonstrate that CleanSeq can effectively process contaminated samples to obtain decontaminated reads, from which reliable results (i.e., variant calling) can be obtained.

Details

Language :
English
ISSN :
12126209 and 20763417
Volume :
12
Issue :
12
Database :
Directory of Open Access Journals
Journal :
Applied Sciences
Publication Type :
Academic Journal
Accession number :
edsdoj.290b53ba7d849d4a99e6cb102c09df1
Document Type :
article
Full Text :
https://doi.org/10.3390/app12126209