1. Immunogenetic sequence annotation based on IMGT-ONTOLOGY
- Author
Marie-Paule Lefranc, Géraldine Folch, Fatena Bellahcene, Patrice Duroux, François Ehrenmann, Véronique Giudicelli, and Joumana Jabado-Michaloud
- Subjects
genomic DNA, Annotation, Sequence annotation, Molecular type, General Materials Science, Chain type, Computational biology, Ontology (information science), Biology, Sequence Ontology
- Abstract
IMGT/LIGM-DB^1^ is the first and largest IMGT^®^ database,^2^ in which more than 136,000 immunoglobulin (IG) and T cell receptor (TR) nucleotide sequences from human and 235 other vertebrate species are managed, analysed and annotated (April 2009). The expert annotation of these sequences and the added standardized knowledge are based on IMGT-ONTOLOGY, the first ontology developed in the field of immunogenetics and immunoinformatics.^3^ The annotation of immunogenetic sequences requires considerable expertise, owing to the unusual (non-classical exon/intron) structure of the IG and TR genes and to the characteristic chain synthesis that results from V-J and V-D-J DNA rearrangements. How a sequence is annotated depends on its molecular type (gDNA, mRNA, cDNA or protein), on its configuration type (germline or rearranged), and on whether sequences from the species concerned are present in the IMGT reference directory sets. IMGT/V-QUEST^5^ and internal tools (IMGT/Automat, IMGT/LIGMotif, IMGT/BLAST and IMGT/DomainGapAlign) were developed to support this annotation. The first step identifies the chain type (for instance, IG-Heavy) and assigns standardized keywords (IDENTIFICATION axiom). The second step is the classification of IG and TR genes and alleles (CLASSIFICATION axiom). The third step is the description (DESCRIPTION axiom) of the V, D, J and C genes and alleles with specific standardized labels. There are more than 590 IMGT standardized labels, of which 64 have been entered in the Sequence Ontology (SO).
The delimitations and lengths of the FR-IMGT and CDR-IMGT, together with the positions of conserved amino acids, are based on the IMGT unique numbering (NUMEROTATION axiom) and bridge the gap between sequences and 3D structures.^6^ The complete annotation of immunogenetic germline (V, D, J) and C sequences is followed by the update of the IMGT Repertoire (IMGT Gene tables, Alignments of alleles, Protein displays, Colliers de Perles, etc.), the IMGT® gene database (IMGT/GENE-DB) and the IMGT reference directory sets of the IMGT® tools (IMGT/V-QUEST, IMGT/JunctionAnalysis and IMGT/DomainGapAlign).
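
As a rough illustration of the NUMEROTATION step, the sketch below cuts a V-domain amino acid sequence that has already been gapped to IMGT positions 1 to 128 into its FR-IMGT and CDR-IMGT regions and reports the CDR-IMGT lengths. The region boundaries and conserved positions are those of the IMGT unique numbering for V-DOMAIN; the function names, the example sequence and the restriction to a rearranged domain with a CDR3-IMGT of at most 13 residues are assumptions made for this example, not part of any IMGT tool.

```python
# Illustrative sketch only: the names below are hypothetical, not an IMGT tool API.
# Region boundaries are those of the IMGT unique numbering for a V-DOMAIN
# (assumes a rearranged domain whose CDR3-IMGT needs no extra positions).
IMGT_V_DOMAIN_REGIONS = {
    "FR1-IMGT": (1, 26),
    "CDR1-IMGT": (27, 38),
    "FR2-IMGT": (39, 55),
    "CDR2-IMGT": (56, 65),
    "FR3-IMGT": (66, 104),
    "CDR3-IMGT": (105, 117),
    "FR4-IMGT": (118, 128),
}

# A few conserved positions in the same numbering (IMGT label -> position).
CONSERVED_POSITIONS = {"1st-CYS": 23, "CONSERVED-TRP": 41, "2nd-CYS": 104}


def delimit_regions(gapped, gap_chars=".-"):
    """Cut a 128-column gapped sequence into FR-IMGT and CDR-IMGT residues."""
    if len(gapped) != 128:
        raise ValueError("expected one character per IMGT position 1..128")
    return {
        label: "".join(ch for ch in gapped[start - 1:end] if ch not in gap_chars)
        for label, (start, end) in IMGT_V_DOMAIN_REGIONS.items()
    }


def cdr_lengths(regions):
    """Format the three CDR-IMGT lengths in bracketed style, e.g. [8.8.13]."""
    return "[" + ".".join(str(len(regions[f"CDR{i}-IMGT"])) for i in (1, 2, 3)) + "]"


def conserved_residues(gapped):
    """Return the character found at each conserved position."""
    return {label: gapped[pos - 1] for label, pos in CONSERVED_POSITIONS.items()}


# Synthetic, made-up sequence (not real data); dots are IMGT gaps.
example = (
    "A" * 22 + "C" + "A" * 3   # FR1-IMGT, positions 1-26
    + "G" * 8 + "." * 4        # CDR1-IMGT, positions 27-38
    + "AA" + "W" + "A" * 14    # FR2-IMGT, positions 39-55
    + "S" * 8 + "." * 2        # CDR2-IMGT, positions 56-65
    + "A" * 38 + "C"           # FR3-IMGT, positions 66-104
    + "R" * 13                 # CDR3-IMGT, positions 105-117
    + "W" + "G" * 10           # FR4-IMGT, positions 118-128
)
print(cdr_lengths(delimit_regions(example)))  # [8.8.13]
print(conserved_residues(example))
```

Because every residue is addressed by a fixed position, the same lookup applies to any chain type once the sequence is gapped, which is what allows sequence annotation to be related to 3D structure positions.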
- Published
- 2009