51. The CAZy Database/the Carbohydrate-Active Enzyme (CAZy) Database: Principles and Usage Guidelines
- Author
- Nicolas Terrapon, Elodie Drula, Bernard Henrissat, Pedro M. Coutinho, and Vincent Lombard; Architecture et fonction des macromolécules biologiques (AFMB), Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Institut National de la Recherche Agronomique (INRA)
- Subjects
- 0301 basic medicine, CAZy, Database, Protein family, Computer science, [SDV]Life Sciences [q-bio], 030106 microbiology, Protein domain, Protein Data Bank (RCSB PDB), computer.file_format, computer.software_genre, Protein Data Bank, 03 medical and health sciences, Annotation, 030104 developmental biology, Protein sequencing, GenBank, computer, ComputingMilieux_MISCELLANEOUS
- Abstract
- Carbohydrate-Active enZymes (CAZymes) assemble, break down, and modify glycans and glycoconjugates using their catalytic and binding modules (functional protein domains). Since 1998, the CAZy database has offered an online, continuously updated classification of CAZyme modules (Lombard et al. 2014). Each module family in the CAZy classification has been created on the basis of experimentally characterized protein modules from the literature, and the families are populated with related module sequences from public protein sequence databases. Because no universal threshold allows the systematic classification of the various CAZyme families, CAZy annotations result from an expert combination of module modeling/calibration and human curation. CAZy annotations are made publicly available for all proteins released by GenBank (Benson et al. 2012), Swiss-Prot (Boutet et al. 2016), and the Protein Data Bank (PDB; http://www.rcsb.org; Berman et al. 2000). In addition, functional and 3-D structural information, curated from the literature on a regular basis, adds essential value to the CAZy annotation. In this spirit, the display of ligand information from crystallographic complexes has recently been developed (Lombard et al. 2014). This chapter guides the reader through the use of CAZy to search enzyme annotations. It also answers frequent questions such as (i) how to obtain CAZy annotations for a specific protein, a genome, or a metagenome, (ii) how to have a newly characterized family included in the CAZy classification scheme, (iii) why CAZy does not cover all protein families related to glycans/glycoconjugates, and (iv) why CAZy does not transfer functional annotation to similar sequences. Finally, we present a recent CAZy-associated tool, namely the Polysaccharide Utilization Loci (PUL) predictor and database for Bacteroidetes species (Terrapon et al. 2015).
- Published
- 2017