Back to Search Start Over

Automatic, context-specific generation of Gene Ontology slims

Authors :
Sehgal Muhammad
Davis Melissa J
Ragan Mark A
Source :
BMC Bioinformatics, Vol 11, Iss 1, p 498 (2010)
Publication Year :
2010
Publisher :
BMC, 2010.

Abstract

Abstract Background The use of ontologies to control vocabulary and structure annotation has added value to genome-scale data, and contributed to the capture and re-use of knowledge across research domains. Gene Ontology (GO) is widely used to capture detailed expert knowledge in genomic-scale datasets and as a consequence has grown to contain many terms, making it unwieldy for many applications. To increase its ease of manipulation and efficiency of use, subsets called GO slims are often created by collapsing terms upward into more general, high-level terms relevant to a particular context. Creation of a GO slim currently requires manipulation and editing of GO by an expert (or community) familiar with both the ontology and the biological context. Decisions about which terms to include are necessarily subjective, and the creation process itself and subsequent curation are time-consuming and largely manual. Results Here we present an objective framework for generating customised ontology slims for specific annotated datasets, exploiting information latent in the structure of the ontology graph and in the annotation data. This framework combines ontology engineering approaches, and a data-driven algorithm that draws on graph and information theory. We illustrate this method by application to GO, generating GO slims at different information thresholds, characterising their depth of semantics and demonstrating the resulting gains in statistical power. Conclusions Our GO slim creation pipeline is available for use in conjunction with any GO-annotated dataset, and creates dataset-specific, objectively defined slims. This method is fast and scalable for application to other biomedical ontologies.

Details

Language :
English
ISSN :
14712105
Volume :
11
Issue :
1
Database :
Directory of Open Access Journals
Journal :
BMC Bioinformatics
Publication Type :
Academic Journal
Accession number :
edsdoj.b8de12f69d6a4ceda0aebda641c41bfd
Document Type :
article
Full Text :
https://doi.org/10.1186/1471-2105-11-498