
A catalog of reference genomes from the human microbiome

Authors :
Dirk Gevers
Doyle V. Ward
Sean M. Sykes
Justin Johnson
Michael Fitzgerald
Jessica B. Hostetler
Sarah Young
Dianhui Zhu
Jason R. Miller
Chinnappa D. Kodira
Aaron M. Berlin
Erica Sodergren
Michael G. Surette
Matthew D. Pearson
Asif T. Chinwalla
Ashlee M. Earl
Michael Feldgarden
Kim C. Worley
Christian J. Buhay
Lisa Hemphill
Manolito Torralba
Xiang Qin
Jonathan H. Badger
Clint Howarth
Marcus Gillis
Candace N. Farmer
Karen E. Nelson
Bruce W. Birren
Richard A. Gibbs
Sandra W. Clifton
Mike Holder
Aye Wollam
Carsten Russ
Wesley C. Warren
Brian J. Haas
Michelle G. Giglio
Qiang Xu
Richard K. Wilson
Patrick Minx
Scott Durkin
Chad Nusbaum
Craig Pohl
Robert S. Fulton
Steve Ferriera
Chandri N. Yandava
Sharvari Gujja
Qiandong Zeng
Ramana Madupu
Victor Markowitz
Joshua Orvis
Andrew Cree
Chad Tomlinson
Donna M. Muzny
Vandita Joshi
Amrita Pati
Christie Kovar
Theresa A. Hepburn
Nikos C. Kyrpides
Kymberlie H. Pepin
Douglas B. Rusch
Granger G. Sutton
Amr Abouelleil
Konstantinos Liolios
Sarah K. Highlander
Yan Ding
Jamison McCorrison
George M. Weinstock
Kris A. Wetterstrand
Emma Allen-Vercoe
Teena Mehta
Lucinda Fulton
Katarzyna Wilczek-Boney
Makedonka Mitreva
Lei Chen
Joseph F. Petrosino
Heather Huot Creasy
Shannon Dugan
Lan Zhang
Owen White
Robert L. Strausberg
Jennifer R. Wortman
Jonathan Crabtree
Barbara A. Methé
Source :
Science (New York, N.Y.). 328(5981)
Publication Year :
2010

Abstract

The human microbiome refers to the community of microorganisms, including prokaryotes, viruses, and microbial eukaryotes, that populate the human body. The National Institutes of Health launched an initiative that focuses on describing the diversity of microbial species that are associated with health and disease. The first phase of this initiative includes the sequencing of hundreds of microbial reference genomes, coupled to metagenomic sequencing from multiple body sites. Here we present results from an initial reference genome sequencing of 178 microbial genomes. From 547,968 predicted polypeptides that correspond to the gene complement of these strains, previously unidentified ("novel") polypeptides that had both unmasked sequence length greater than 100 amino acids and no BLASTP match to any nonreference entry in the nonredundant subset were defined. This analysis resulted in a set of 30,867 polypeptides, of which 29,987 (approximately 97%) were unique. In addition, this set of microbial genomes allows for approximately 40% of random sequences from the microbiome of the gastrointestinal tract to be associated with organisms based on the match criteria used. Insights into pan-genome analysis suggest that we are still far from saturating microbial species genetic data sets. In addition, the associated metrics and standards used by our group for quality assurance are presented.
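The novelty criterion in the abstract (unmasked length greater than 100 amino acids and no BLASTP match to a nonreference entry) can be illustrated with a minimal sketch. The `Polypeptide` record, its field names, and the helper functions below are hypothetical and are not taken from the paper; the sketch assumes that masking and BLASTP searches have already been run elsewhere and only their outcomes are recorded per sequence.

```python
from dataclasses import dataclass

# Threshold stated in the abstract: unmasked length must exceed 100 amino acids.
MIN_UNMASKED_LENGTH = 100


@dataclass
class Polypeptide:
    """Hypothetical record; field names are illustrative assumptions."""
    seq_id: str
    unmasked_length: int        # amino acids left after masking
    has_nonreference_hit: bool  # True if BLASTP matched a nonreference entry


def is_novel(p: Polypeptide) -> bool:
    """Apply both conditions from the abstract to one polypeptide."""
    return p.unmasked_length > MIN_UNMASKED_LENGTH and not p.has_nonreference_hit


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    # Made-up example records, not data from the study.
    examples = [
        Polypeptide("p1", 250, False),  # long, no hit: novel
        Polypeptide("p2", 250, True),   # long, but has a hit
        Polypeptide("p3", 100, False),  # exactly 100 is not greater than 100
    ]
    print([p.seq_id for p in examples if is_novel(p)])

    # Counts reported in the abstract: 29,987 unique out of 30,867 novel.
    print(round(percent(29_987, 30_867)))
```

The final line reproduces the "approximately 97%" figure from the counts given in the abstract; everything else in the sketch is illustrative only.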

Details

ISSN :
1095-9203
Volume :
328
Issue :
5981
Database :
OpenAIRE
Journal :
Science (New York, N.Y.)
Accession number :
edsair.doi.dedup.....8be674be3fde8b50bfbdf36a3add7e18