
Influenza Classification from Short Reads with VAPOR Facilitates Robust Mapping Pipelines and Zoonotic Strain Detection for Routine Surveillance Applications

Authors :
Thomas R. Connor
Benjamin Southgate
S. Corden
Catherine Moore
Joel Southgate
Matthew J. Bull
Joanne Watkins
Clare M. Brown
Source :
Bioinformatics
Publication Year :
2019
Publisher :
Cold Spring Harbor Laboratory, 2019.

Abstract

Background: Influenza viruses are associated with a significant global public health burden. The segmented RNA genome of influenza changes continually due to mutation, and the accumulation of these changes within the antigenic recognition sites of haemagglutinin (HA) and neuraminidase (NA) in turn leads to annual epidemics. Influenza A is also zoonotic, allowing for exchange of segments between human and non-human viruses, resulting in new strains with pandemic potential. These processes necessitate a global surveillance system for influenza monitoring. To this end, whole-genome sequencing (WGS) has begun to emerge as a useful tool. However, due to the diversity and mutability of the influenza genome, and noise in short-read data, bioinformatics processing can present challenges.

Results: Conventional mapping approaches can be insufficient when a sub-optimal reference strain is chosen. For short-read datasets simulated from influenza H1N1 HA sequences, read recovery after single-reference mapping was routinely as low as 90% for human-origin influenza sequences, and often lower than 10% for those from avian hosts. To this end, we developed a de Bruijn Graph (DBG)-based classifier of influenza WGS datasets: VAPOR. In real data benchmarking using 257 WGS read sets with corresponding de novo assemblies, VAPOR provided classifications for all samples with a mean of >99.8% identity to assembled contigs. This resulted in an increase in the number of mapped reads by 6.8% on average, up to a maximum of 13.3%. Additionally, using simulations, we demonstrate that classification from reads may be applied to detection of reassorted strains.

Conclusions: VAPOR has potential to simplify bioinformatics pipelines for surveillance, providing a novel method for detection of influenza strains of human and non-human origin directly from reads, minimization of potential data loss and bias associated with conventional mapping, and allowing visualization of alignments that would otherwise require slow de novo assembly. Whilst with expertise and time these pitfalls can largely be avoided, with pre-classification they are remedied in a single step. Furthermore, our algorithm could be adapted in future to surveillance of other RNA viruses. VAPOR is available at https://github.com/connor-lab/vapor.
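
The idea of classifying a read set before mapping can be illustrated with a much simpler stand-in than the DBG-based method the abstract describes. The sketch below is not VAPOR's algorithm: it only scores each candidate reference by the fraction of its k-mers that also occur in the reads, then picks the highest-scoring one as the mapping reference. All function names, the toy sequences, and the choice of k=5 are assumptions made for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_kmer_counts(reads, k):
    """Count k-mers across all reads (a crude proxy for a weighted graph)."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def score_reference(ref, read_kmers, k):
    """Fraction of the reference's k-mers that are present in the reads."""
    ref_kmers = list(kmers(ref, k))
    if not ref_kmers:
        return 0.0
    hits = sum(1 for km in ref_kmers if km in read_kmers)
    return hits / len(ref_kmers)


def classify(reads, references, k=5):
    """Return the best-scoring reference name and all scores."""
    read_kmers = build_kmer_counts(reads, k)
    scores = {
        name: score_reference(seq, read_kmers, k)
        for name, seq in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy, made-up sequences (hypothetical; not real influenza data).
references = {
    "refA": "ATGAAGGCAATACTAGTAGTTCTGC",
    "refB": "ATGAAGACTATCATTGCTTTGAGCT",
}
# Error-free reads drawn from refA.
reads = ["ATGAAGGCAATAC", "GCAATACTAGTAG", "CTAGTAGTTCTGC"]

best, scores = classify(reads, references)
print(best, scores)
```

In a pipeline, the selected reference would then be passed to a conventional read mapper. Exact k-mer matching like this is sensitive to sequencing errors and mutations, which is part of the motivation for a more tolerant graph-based approach.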

Details

ISSN :
1367-4803
Database :
OpenAIRE
Journal :
Bioinformatics
Accession number :
edsair.doi.dedup.....cd890cdd354f3f424c25b378e3c35a3d