1. Taxonomic assignment of uncultivated prokaryotic virus genomes is enabled by gene-sharing networks
- Author
Simon Roux, J. Rodney Brister, Ho Bin Jang, Olivier Zablocki, Mart Krupovic, Jens H. Kuhn, Evelien M. Adriaenssens, Matthew B. Sullivan, Andrew M. Kropinski, Rob Lavigne, Dann Turner, Benjamin Bolduc
Ohio State University [Columbus] (OSU), Integrated Research Facility at Fort Detrick (IRF-Frederick), National Institute of Allergy and Infectious Diseases [Bethesda] (NIAID-NIH), National Institutes of Health [Bethesda] (NIH)-National Institutes of Health [Bethesda] (NIH), DOE Joint Genome Institute [Walnut Creek], U.S. Department of Energy [Washington] (DOE), Institute of Integrative Biology [Liverpool, UK], University of Liverpool, Quadram Institute, National Center for Biotechnology Information (NCBI), National Institutes of Health [Bethesda] (NIH), Ontario Veterinary College [Univ. Guelph, Canada], University of Guelph, Biologie Moléculaire du Gène chez les Extrêmophiles (BMGE), Institut Pasteur [Paris], Faculty of BioScience Engineering [KU Leuven, Belgium], Catholic University of Leuven - Katholieke Universiteit Leuven (KU Leuven), University of the West of England [Bristol] (UWE Bristol)
High-performance computational support was provided as an award from the Ohio Supercomputer Center to M.B.S. Funding was provided in part by the Department of Energy’s Genome Sciences Program Soil Microbiome Scientific Focus Area award (no. SCW1632) to Lawrence Livermore National Laboratory, NSF Biological Oceanography awards (OCE no. 1536989 and OCE no. 1756314), and a Gordon and Betty Moore Foundation Investigator Award (no. 3790) to M.B.S. Funding was provided to J.R.B. by the Intramural Research Program of the National Institutes of Health (NIH) National Library of Medicine. The work conducted by the US Department of Energy Joint Genome Institute is supported by the Office of Science of the US Department of Energy under Contract DE-AC02-05CH11231 to S.R. This work was funded in part through Battelle Memorial Institute’s prime contract with the US National Institute of Allergy and Infectious Diseases (NIAID) under Contract no. HHSN272200700016I to J.H.K. The content of this publication does not necessarily reflect the views or policies of the US Department of Health and Human Services or of the institutions and companies affiliated with the authors.
We thank L. Bollinger, G. Trubl and I. Tolstoy for their comments on improving the manuscript, as well as Z.-Q. You for helping push the network analytics.
Biotechnology and Biological Sciences Research Council (BBSRC), and Institut Pasteur [Paris] (IP)
- Subjects
MESH: Genome, Viral/genetics; Classification and taxonomy; MESH: Viruses/classification; MESH: Gene Regulatory Networks/genetics; Applied Microbiology and Biotechnology; Genome; 0302 clinical medicine; RefSeq; TOOL; Bacteriophages; Gene Regulatory Networks; Viral; MESH: Phylogeny; Phylogeny; 0303 health sciences; MESH: Metagenomics; Classification; MESH: Classification; Viruses; [SDV.MP.VIR]Life Sciences [q-bio]/Microbiology and Parasitology/Virology; Molecular Medicine; ICTV BACTERIAL; POPULATIONS; Taxonomy (biology); MESH: Prokaryotic Cells/virology; MESH: Viruses/genetics; Centre for Research in Biosciences; MESH: Bacteriophages/genetics; Life Sciences & Biomedicine; Biotechnology; Biomedical Engineering; Bioengineering; PHAGE; Computational biology; Genome, Viral; Biology; CLASSIFICATION; 03 medical and health sciences; MD Multidisciplinary; Human virome; Virus classification; 030304 developmental biology; virus classification, archaeal and bacterial virus taxonomy, scalable framework, gene-sharing networks; Science & Technology; ARCHAEAL VIRUSES; Hierarchical clustering; Computational biology and bioinformatics; Prokaryotic Cells; MESH: Metagenome/genetics; Biotechnology & Applied Microbiology; Metagenomics; EVOLUTIONARY; Metagenome; UPDATE; ECOGENOMICS; Bacterial virus; 030217 neurology & neurosurgery; Software; GENERATION
- Abstract
Microbiomes from every environment contain a myriad of uncultivated archaeal and bacterial viruses, but studying these viruses is hampered by the lack of a universal, scalable taxonomic framework. We present vConTACT v.2.0, a network-based application utilizing whole genome gene-sharing profiles for virus taxonomy that integrates distance-based hierarchical clustering and confidence scores for all taxonomic predictions. We report near-identical (96%) replication of existing genus-level viral taxonomy assignments from the International Committee on Taxonomy of Viruses for National Center for Biotechnology Information virus RefSeq. Application of vConTACT v.2.0 to 1,364 previously unclassified viruses deposited in virus RefSeq as reference genomes produced automatic, high-confidence genus assignments for 820 of the 1,364. We applied vConTACT v.2.0 to analyze 15,280 Global Ocean Virome genome fragments and were able to provide taxonomic assignments for 31% of these data, which shows that our algorithm is scalable to very large metagenomic datasets. Our taxonomy tool can be automated and applied to metagenomes from any environment for virus classification.
ispartof: NATURE BIOTECHNOLOGY vol:37 issue:6 pages:632-+
location: United States
status: published
- Published
- 2019