UniProtKB amid the turmoil of plant proteomics research
- Source :
- Frontiers in Plant Science, Vol 3 (2012)
- Publication Year :
- 2012
- Publisher :
- Frontiers Media SA, 2012.
Abstract
- The UniProt KnowledgeBase (UniProtKB) provides a single, centralized, authoritative resource for protein sequences and functional information. The majority of its records are based on automatic translation of coding sequences (CDS) provided by submitters at the time of initial deposition to the nucleotide sequence databases (INSDC). This article gives a general overview of the current situation, with some specific illustrations drawn from our annotation of the Arabidopsis and rice proteomes. More and more frequently, only the raw sequence of a complete genome is deposited in the nucleotide sequence databases, while the gene model predictions and annotations are kept in separate, specialized model organism databases (MODs). To be able to provide the complete proteome of model organisms, UniProtKB had to implement pipelines for importing protein sequences from Ensembl and EnsemblGenomes. A single genome can be the target of several unrelated sequencing projects, and the final assemblies and gene model predictions may diverge quite significantly. In addition, several cultivars of the same species are often sequenced (the sequencing of 1001 Arabidopsis cultivars is currently under way), and the resulting proteomes are far from identical. Therefore, one challenge for UniProtKB is to store and organize these data in a convenient way and to clearly define the reference proteomes that should be made available to users. Manual annotation is one of the hallmarks of the Swiss-Prot section of UniProtKB. Besides adding functional annotation, curators check, and often correct, gene model predictions. For plants, this task is limited to Arabidopsis thaliana and Oryza sativa subsp. japonica. Proteomics data that provide experimental evidence confirming the existence of proteins, or that identify sequence features such as post-translational modifications, are also imported into UniProtKB records, and the knowledgebase is cross-referenced to numerous proteomics resources.
- Subjects :
- 0106 biological sciences
Review Article
Plant Science
Computational biology
lcsh:Plant culture
Biology
Proteomics
01 natural sciences
Genome
03 medical and health sciences
Annotation
proteomics
Arabidopsis
complete proteome
Ensembl
lcsh:SB1-1110
genome
030304 developmental biology
Genetics
0303 health sciences
Nucleic acid sequence
food and beverages
biology.organism_classification
knowledgebase
Proteome
UniProt
protein
010606 plant biology & botany
Details
- ISSN :
- 1664462X
- Volume :
- 3
- Database :
- OpenAIRE
- Journal :
- Frontiers in Plant Science
- Accession number :
- edsair.doi.dedup.....0ffb4ddfc9a5ee4e27394377caabacf7
- Full Text :
- https://doi.org/10.3389/fpls.2012.00270