A Bioinformatic Pipeline for Improved Genome Analysis and Clustering of Isolates during Outbreaks of Legionnaires' Disease

Authors :
Pascal Lapierre
Wolfgang Haas
Kimberlee A. Musser
Source :
J Clin Microbiol
Publication Year :
2020

Abstract

Legionnaires’ disease, a severe lung infection caused by the bacterium Legionella pneumophila, occurs as single cases or in outbreaks that are actively tracked by public health departments. To determine the point source of an outbreak, clinical isolates need to be compared to environmental samples to find matching isolates. One confounding factor is the genome plasticity of L. pneumophila, making an exact sequence comparison by whole-genome sequencing (WGS) challenging. Here, we present a WGS analysis pipeline, LegioCluster, that is designed to circumvent this problem by automatically selecting the best matching reference genome prior to mapping and variant calling. This approach reduces the number of false-positive variant calls, maximizes the fraction of all genomes that are being compared, and naturally clusters the isolates according to their reference strain. Isolates that are too distant from any genome in the database are added to the list of candidate references, thereby creating a new cluster. Short insertions or deletions are considered in addition to single-nucleotide polymorphisms for increased discriminatory power. This manuscript describes the use of this automated and “locked down” bioinformatic pipeline deployed at the New York State Department of Health’s Wadsworth Center for investigating relatedness between clinical and environmental isolates. A similar pipeline has not been widely available for use to support these critically important public health investigations.
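The reference-selection step described in the abstract can be illustrated with a minimal sketch. This is not LegioCluster's actual code: the k-mer Jaccard distance, the function names, and the `max_distance` threshold are all assumptions chosen for illustration, and the abstract does not say how the pipeline measures distance between genomes. The sketch only shows the general idea of assigning an isolate to its closest reference, or promoting it to a new candidate reference (and thus a new cluster) when no reference is close enough.

```python
from typing import Dict, Set, Tuple


def kmers(seq: str, k: int = 4) -> Set[str]:
    """Return the set of all k-mers in a sequence (toy stand-in for a genome sketch)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |A ∩ B| / |A ∪ B|; an assumed distance measure, not the pipeline's own."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def assign_reference(
    isolate_id: str,
    isolate_seq: str,
    references: Dict[str, str],
    max_distance: float = 0.5,  # hypothetical cutoff
    k: int = 4,
) -> Tuple[str, bool]:
    """Pick the closest reference for an isolate.

    Returns (reference_id, is_new_cluster). If every reference is farther
    than max_distance, the isolate is added to `references` and becomes
    the reference of a new cluster.
    """
    iso_kmers = kmers(isolate_seq, k)
    best_id, best_dist = None, float("inf")
    for ref_id, ref_seq in references.items():
        dist = jaccard_distance(iso_kmers, kmers(ref_seq, k))
        if dist < best_dist:
            best_id, best_dist = ref_id, dist

    if best_id is None or best_dist > max_distance:
        references[isolate_id] = isolate_seq  # new candidate reference
        return isolate_id, True
    return best_id, False


# Toy usage with made-up sequences
references = {"refA": "ACGTACGTACGT"}
close_hit = assign_reference("iso1", "ACGTACGTACGA", references)
distant_hit = assign_reference("iso2", "TTTTTTTTTTTT", references)
```

In this toy run, `iso1` is close to `refA` and is assigned to it for mapping and variant calling, while `iso2` shares no k-mers with any reference and is added to the candidate list. A real pipeline would work on assembled genomes or read sets and use a purpose-built distance estimate rather than exact k-mer sets of short strings.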

Details

ISSN :
1098-660X
Volume :
59
Issue :
2
Database :
OpenAIRE
Journal :
Journal of Clinical Microbiology
Accession number :
edsair.doi.dedup.....64d3bfc260b22f69046a606084e09e3c