Judging the Quality of Gene Expression-Based Clustering Methods Using Gene Annotation
- Source :
- Genome Research. 12:1574-1581
- Publication Year :
- 2002
- Publisher :
- Cold Spring Harbor Laboratory, 2002.
Abstract
- We compare several commonly used expression-based gene clustering algorithms using a figure of merit based on the mutual information between cluster membership and known gene attributes. By studying various publicly available expression data sets we conclude that enrichment of clusters for biological function is, in general, highest at rather low cluster numbers. As a measure of dissimilarity between the expression patterns of two genes, no method outperforms Euclidean distance for ratio-based measurements, or Pearson distance for non-ratio-based measurements at the optimal choice of cluster number. We show the self-organized-map approach to be best for both measurement types at higher numbers of clusters. Clusters of genes derived from single- and average-linkage hierarchical clustering tend to produce worse-than-random results. [The algorithm described is available at http://llama.med.harvard.edu, under Software.]
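The quantities named in the abstract can be illustrated with a minimal sketch. This is not the paper's figure of merit or its released software; it only shows, under stated assumptions, how mutual information between a cluster assignment and a single gene attribute could be computed, alongside the two dissimilarity measures mentioned. The function names, the use of bits (log base 2), and the definition of Pearson distance as 1 minus the correlation coefficient are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(clusters, attribute):
    """Mutual information (in bits) between two equal-length label sequences.

    `clusters` holds one cluster label per gene and `attribute` holds one
    annotation value per gene (hypothetical inputs for illustration).
    """
    if len(clusters) != len(attribute):
        raise ValueError("sequences must have the same length")
    n = len(clusters)
    joint = Counter(zip(clusters, attribute))
    cluster_counts = Counter(clusters)
    attribute_counts = Counter(attribute)
    mi = 0.0
    for (c, a), count in joint.items():
        # p(c, a) / (p(c) * p(a)) simplifies to count * n / (n_c * n_a)
        ratio = count * n / (cluster_counts[c] * attribute_counts[a])
        mi += (count / n) * math.log2(ratio)
    return mi


def euclidean_distance(x, y):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_distance(x, y):
    """Assumed definition: 1 minus the Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / (norm_x * norm_y)
```

In this sketch, a clustering that perfectly separates genes with and without an attribute yields the maximum mutual information for that attribute, while a clustering independent of the attribute yields zero.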
- Subjects :
- Quality Control
Genetics
Available expression
Gene Expression Profiling
Computational Biology
Pattern recognition
Saccharomyces cerevisiae
Mutual information
Gene Annotation
Biology
Hierarchical clustering
Euclidean distance
Determining the number of clusters in a data set
Data Interpretation, Statistical
Gene Expression Regulation, Fungal
Methods
Cluster Analysis
Artificial intelligence
Algorithms
Genetics (clinical)
Details
- ISSN :
- 1549-5469 and 1088-9051
- Volume :
- 12
- Database :
- OpenAIRE
- Journal :
- Genome Research
- Accession number :
- edsair.doi.dedup.....a0cc9659b5e77e21718251356736d12d
- Full Text :
- https://doi.org/10.1101/gr.397002