Gene annotation and network inference by phylogenetic profiling
- Source :
- BMC Bioinformatics, Vol 7, Iss 1, p 80 (2006)
- Publication Year :
- 2006
- Publisher :
- BMC, 2006.
Abstract
- Background: Phylogenetic analysis is emerging as one of the most informative computational methods for the annotation of genes and identification of evolutionary modules of functionally related genes. The effectiveness with which phylogenetic profiles can be utilized to assign genes to pathways depends on an appropriate measure of correlation between gene profiles, and an effective decision rule to use the correlate. Current methods, though useful, perform at a level well below what is possible, largely because performance of the latter deteriorates rapidly as coverage increases. Results: We introduce, test and apply a new decision rule, correlation enrichment (CE), for assigning genes to functional categories at various levels of resolution. Among the results are: (1) CE performs better than standard guilt by association (SGA, assignment to a functional category when a simple correlate exceeds a pre-specified threshold) irrespective of the number of genes assigned (i.e., coverage); improvement is greatest at high coverage, where precision (positive predictive value) of CE is approximately 6-fold higher than that of SGA. (2) CE is estimated to allocate each of the 2918 unannotated orthologs to KEGG pathways with an average precision of 49% (approximately 7-fold higher than SGA). (3) An estimated 94% of the 1846 unannotated orthologs in the COG ontology can be assigned a function with an average precision of 0.4 or greater. (4) Dozens of functional and evolutionarily conserved cliques or quasi-cliques can be identified, many having previously unannotated genes. Conclusion: The method serves as a general computational tool for annotating large numbers of unknown genes, uncovering evolutionary and functional modules. It appears to perform substantially better than extant stand-alone high-throughput methods.
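The SGA baseline mentioned in the abstract (assign a gene to a category when a simple correlate exceeds a pre-specified threshold) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the use of Pearson correlation as the "simple correlate", the binary presence/absence profiles, the gene and pathway names, and the 0.5 threshold are all assumptions made for the example. The CE rule itself is not specified in the abstract and is not shown.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length profiles (assumed correlate)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_a * var_b)


def sga_assign(query_profile, annotated, threshold):
    """Guilt by association: collect the category of every annotated gene
    whose profile correlates with the query above the threshold."""
    return {
        category
        for profile, category in annotated.values()
        if pearson(query_profile, profile) > threshold
    }


def precision(predictions, truth):
    """Positive predictive value: correct assignments / all assignments."""
    if not predictions:
        return 0.0
    correct = sum(1 for gene, category in predictions if truth.get(gene) == category)
    return correct / len(predictions)


# Hypothetical toy data: 1 = ortholog present in a genome, 0 = absent.
annotated = {
    "geneA": ([1, 1, 0, 0, 1, 0], "pathway_1"),
    "geneB": ([1, 1, 0, 0, 1, 1], "pathway_1"),
    "geneC": ([0, 0, 1, 1, 0, 1], "pathway_2"),
}
query = [1, 1, 0, 0, 1, 0]

assigned = sga_assign(query, annotated, threshold=0.5)
print(assigned)
```

Lowering the threshold assigns more genes (higher coverage) but admits weaker correlations, which is the coverage versus precision trade-off the abstract refers to.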
- Subjects :
- Inference
Computational biology
Biology
lcsh:Computer applications to medicine. Medical informatics
Biochemistry
Structural Biology
Phylogenetics
Molecular Biology
lcsh:QH301-705.5
Phylogeny
Oligonucleotide Array Sequence Analysis
Genetics
Phylogenetic tree
Methodology Article
Gene Expression Profiling
Applied Mathematics
Sequence Analysis, DNA
Decision rule
Gene Annotation
Computer Science Applications
Gene expression profiling
Gene Expression Regulation
lcsh:Biology (General)
lcsh:R858-859.7
Phylogenetic profiling
DNA microarray
Algorithms
Signal Transduction
Details
- Language :
- English
- ISSN :
- 1471-2105
- Volume :
- 7
- Issue :
- 1
- Database :
- OpenAIRE
- Journal :
- BMC Bioinformatics
- Accession number :
- edsair.doi.dedup.....3cef544e455978ba94921ab132e837e1