Patrick M. Shih, Matt Nolan, Nicole Tandeau de Marsac, Thierry Laurent, Emmanuel Talla, Seth D. Axen, Kaarina Sivonen, Michael Herdman, Muriel Gugger, Karen W. Davenport, Rosmarie Rippka, Dongying Wu, Amel Latifi, Thérèse Coursin, Jonathan A. Eisen, Fei Cai, David P. Fewer, Tanja Woyke, Cheryl A. Kerfeld, Lynne Goodwin, Alexandra Calteau, Cliff Han, Edward M. Rubin, Department of Plant and Microbial Biology [Berkeley], University of California [Berkeley] (UC Berkeley), University of California (UC)-University of California (UC), US Department of Energy, Joint Genome Institute (JGI), University of California [Davis] (UC Davis), University of California (UC), Laboratoire de chimie bactérienne (LCB), Aix Marseille Université (AMU)-Centre National de la Recherche Scientifique (CNRS), Department of Food and Environmental Sciences, Helsingin yliopisto = Helsingfors universitet = University of Helsinki, Génomique métabolique (UMR 8030), Genoscope - Centre national de séquençage [Evry] (GENOSCOPE), Université Paris-Saclay-Direction de Recherche Fondamentale (CEA) (DRF (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Direction de Recherche Fondamentale (CEA) (DRF (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université d'Évry-Val-d'Essonne (UEVE)-Centre National de la Recherche Scientifique (CNRS), Collection des Cyanobactéries, Institut Pasteur [Paris] (IP), Los Alamos National Laboratory (LANL), The work conducted by the US Department of Energy Joint Genome Institute is supported by the Office of Science of the US Department of Energy under Contract DE-AC02-05CH11231. Funding was also provided by the Institut Pasteur and the Centre National de la Recherche Scientifique Unité de Recherche Associée 2172. P.M.S. and C.A.K.
were also supported by National Science Foundation Grant MCB-0851070. All cyanobacteria represented by genomes sequenced in this study are available from the Institut Pasteur.
The cyanobacterial phylum encompasses oxygenic photosynthetic prokaryotes of a great breadth of morphologies and ecologies; they play key roles in global carbon and nitrogen cycles. The chloroplasts of all photosynthetic eukaryotes can trace their ancestry to cyanobacteria. Cyanobacteria also attract considerable interest as platforms for "green" biotechnology and biofuels. To explore the molecular basis of their different phenotypes and biochemical capabilities, we sequenced the genomes of 54 phylogenetically and phenotypically diverse cyanobacterial strains. Comparison of cyanobacterial genomes reveals the molecular basis for many aspects of cyanobacterial ecophysiological diversity, as well as the convergence of complex morphologies without the acquisition of novel proteins. This phylum-wide study highlights the benefits of diversity-driven genome sequencing, identifying more than 21,000 cyanobacterial proteins with no detectable similarity to known proteins, and foregrounds the diversity of light-harvesting proteins and gene clusters for secondary metabolite biosynthesis. Additionally, our results provide insight into the distribution of genes of cyanobacterial origin in eukaryotic nuclear genomes. Moreover, this study doubles both the amount and the phylogenetic diversity of cyanobacterial genome sequence data. Given the exponentially growing number of sequenced genomes, this diversity-driven study demonstrates the perspective and insights gained by comparing disparate yet related genomes in a phylum-wide context.