1. Application of an optimized annotation pipeline to the Cryptococcus deuterogattii genome reveals dynamic primary metabolic gene clusters and genomic impact of RNAi loss
- Author
Gröhs Ferrareze, Patrícia Aline; Maufrais, Corinne; Silva Araujo Streit, Rodrigo; Priest, Shelby; Cuomo, Christina; Heitman, Joseph; Staats, Charley Christian; Janbon, Guilhem
Biologie des ARN des Pathogènes fongiques - RNA Biology of Fungal Pathogens, Institut Pasteur [Paris] (IP)-Université Paris Cité (UPCité), Universidade Federal do Rio Grande do Sul [Porto Alegre] (UFRGS), Hub Bioinformatique et Biostatistique - Bioinformatics and Biostatistics HUB, Duke University Medical Center, Broad Institute of MIT and Harvard (BROAD INSTITUTE), Harvard Medical School [Boston] (HMS)-Massachusetts Institute of Technology (MIT)-Massachusetts General Hospital [Boston]
PAGF was supported by a CAPES exchange grant (Advanced Network of Computational Biology - RABICÓ - Grant no. 23038.010041/2013-13). This work was supported by a CAPES COFECUB grant no. 39712ZK to GJ, by a CNPq grant (309897/2017-3) to CCS, and by a NIH/NIAID R37 MERIT Award AI39115-23 and a NIH/NIAID R01 Award AI50113-16 to JH. JH is co-director and fellow of CIFAR program Fungal Kingdom: Threats & Opportunities. SJP was supported by the NIH/NIAID F31 Fellowship 1F31AI143136-02. Members of the Heitman laboratory are acknowledged for valuable discussion.
- Subjects
AcademicSubjects/SCI01140; AcademicSubjects/SCI00010; genome annotation pipeline; MESH: Genomics; [SDV]Life Sciences [q-bio]; MESH: RNA Interference; MESH: Cryptococcus neoformans; Molecular Sequence Annotation; MESH: Molecular Sequence Annotation; Genomics; Cryptococcus deuterogattii; AcademicSubjects/SCI01180; metabolic gene cluster; RNAi; Multigene Family; MESH: Genome, Fungal; Fungal Genetics and Genomics; Cryptococcus neoformans; MESH: Multigene Family; AcademicSubjects/SCI00960; RNA Interference; Genome, Fungal
- Abstract
Evaluating the quality of a de novo annotation of a complex fungal genome based on RNA-seq data remains a challenge. In this study, we sequentially optimized a Cufflinks-CodingQuary-based bioinformatics pipeline fed with RNA-seq data using the manually annotated model pathogenic yeasts Cryptococcus neoformans and Cryptococcus deneoformans as test cases. Our results show that the quality of the annotation is sensitive to the quantity of RNA-seq data used and that the best quality is obtained with 5–10 million reads per RNA-seq replicate. We also showed that the number of introns predicted is an excellent a priori indicator of the quality of the final de novo annotation. We then used this pipeline to annotate the genome of the RNAi-deficient species Cryptococcus deuterogattii strain R265 using RNA-seq data. Dynamic transcriptome analysis revealed that intron retention is more prominent in C. deuterogattii than in the RNAi-proficient species C. neoformans and C. deneoformans. In contrast, we observed that antisense transcription was not higher in C. deuterogattii than in the two other Cryptococcus species. Comparative gene content analysis identified 21 clusters enriched in transcription factors and transporters that have been lost. Interestingly, analysis of the subtelomeric regions in these three annotated species identified a similar gene enrichment, reminiscent of the structure of primary metabolic clusters. Our data suggest that there is active exchange between subtelomeric regions, and that other chromosomal regions might participate in adaptive diversification of Cryptococcus metabolite assimilation potential.
- Published
- 2021