Stephane Pesant, Allan Anthony Kamau, Josep M. Gasol, Guillem Salazar, Susana Agustí, Ramon Massana, Chris Bowler, Intikhab Alam, Matthew B. Sullivan, José M. González, Hiroyuki Ogata, Takashi Gojobori, Jeroen Raes, Pablo Sánchez, Jesús M. Arrieta, Marta Sebastián, Simon Roux, Francisco M. Cornejo-Castillo, Marta Royo-Llonch, Ramiro Logares, Silvia G. Acinas, Carlos M. Duarte, Pascal Hingamp, Lucas Paoli, Shinichi Sunagawa, Carlos Pedrós-Alió, Peer Bork, Dolors Vaqué, Gipsi Lima-Mendez
Institut méditerranéen d'océanologie (MIO), Institut de Recherche pour le Développement (IRD)-Aix Marseille Université (AMU)-Institut national des sciences de l'Univers (INSU - CNRS)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), UCL - SST/LIBST - Louvain Institute of Biomolecular Science and Technology, Ministerio de Economía y Competitividad (España), Department of Energy (US), Ministerio de Ciencia, Innovación y Universidades (España), Generalitat de Catalunya, Agencia Estatal de Investigación (España), King Abdullah University of Science and Technology, and European Commission
15 pages, 7 figures, supplementary information https://doi.org/10.1038/s42003-021-02112-2

All data generated or analyzed during this study are included in this published article (and its supplementary information files). All raw sequences are publicly available at both DOE’s JGI Integrated Microbial Genomes and Microbiomes (IMG/MER) and the European Nucleotide Archive (ENA). Individual metagenome assemblies, annotation files, and alignment files can be accessed at IMG/MER. All accession numbers are listed in Supplementary Data 1. The co-assembly for the MAG dataset construction can be found through ENA at https://www.ebi.ac.uk/ena with accession number PRJEB40454. The nucleotide sequences for each MAG and their annotation files can be found through BioStudies at https://www.ebi.ac.uk/biostudies with accession S-BSST457 and also on the companion website to this manuscript at https://malaspina-public.gitlab.io/malaspina-deep-ocean-microbiome/.

All software used in this work is publicly available from its respective developers and is described in “Methods”, including the versions and options used. Additional custom scripts to assign taxonomy to the M-geneDB genes and to filter and format FRA results are available through BioStudies at https://www.ebi.ac.uk/biostudies with accession S-BSST457.

The deep sea, the ocean’s largest compartment, drives planetary-scale biogeochemical cycling. Yet the functional exploration of its microbial communities lags far behind that of other environments. Here we analyze 58 metagenomes from tropical and subtropical deep oceans to generate the Malaspina Gene Database. Free-living or particle-attached lifestyles drive functional differences in bathypelagic prokaryotic communities, regardless of their biogeography.
Ammonia and CO oxidation pathways are enriched in the free-living microbial communities, whereas dissimilatory nitrate reduction to ammonium and H2 oxidation pathways are enriched in the particle-attached communities. The Calvin-Benson-Bassham cycle is the most prevalent inorganic carbon fixation pathway in both size fractions. Reconstruction of the Malaspina Deep Metagenome-Assembled Genomes reveals unique non-cyanobacterial diazotrophic bacteria and chemolithoautotrophic prokaryotes. The widespread potential to grow both autotrophically and heterotrophically suggests that mixotrophy is an ecologically relevant trait in the deep ocean. These results expand our understanding of the functional microbial structure and metabolic capabilities of Earth’s largest aquatic ecosystem.

This work was funded by the Spanish Ministry of Economy and Competitiveness (MINECO) through the Consolider-Ingenio program (Malaspina 2010 Expedition, ref. CSD2008-00077). The sequencing of 58 bathypelagic metagenomes was done by the U.S. Department of Energy Joint Genome Institute, supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 to S.G.A. (CSP 612 “Microbial metagenomics and transcriptomics from a global deep-ocean expedition”). Additional funding was provided by the project MAGGY (CTM2017-87736-R) to S.G.A. from the Spanish Ministry of Economy and Competitiveness, by Grup de Recerca 2017SGR/1568 from Generalitat de Catalunya, by King Abdullah University of Science and Technology (KAUST) under contract OSR #3362, and by the EMFF Program of the European Union (MERCLUB project, Grant Agreement 863584). The ICM researchers have had the institutional support of the “Severo Ochoa Centre of Excellence” accreditation (CEX2019-000928-S).
High-performance computing analyses were run at the Marine Bioinformatics Service (MARBITS, https://marbits.icm.csic.es) of the Institut de Ciències del Mar (ICM-CSIC), the Barcelona Supercomputing Center (Grant BCV-2013-2-0001), and KAUST’s Ibex HPC.