Classification and feature selection algorithms for multi-class CGH data
- Source :
- ISMB, Bioinformatics
- Publication Year :
- 2008
- Publisher :
- Oxford University Press (OUP), 2008.
Abstract
- Recurrent chromosomal alterations provide cytological and molecular positions for the diagnosis and prognosis of cancer. Comparative genomic hybridization (CGH) has been useful in understanding these alterations in cancerous cells. CGH datasets consist of samples that are represented by high-dimensional arrays of intervals. Each sample consists of long runs of intervals with losses and gains. In this article, we develop novel SVM-based methods for classification and feature selection of CGH data. For classification, we developed a novel similarity kernel that is shown to be more effective than the standard linear kernel used in SVM. For feature selection, we propose a novel method based on the new kernel that iteratively selects the features that provide the maximum benefit for classification. We compared our methods against the best wrapper-based and filter-based approaches that have been used for feature selection of high-dimensional biological data. Our results on datasets generated from the Progenetix database suggest that our methods are considerably superior to existing methods. Availability: All software developed in this article can be downloaded from http://plaza.ufl.edu/junliu/feature.tar.gz Contact: juliu@cise.ufl.edu
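The iterative selection idea in the abstract can be illustrated with a small sketch. This is not the authors' method or kernel: the abstract does not define either, so everything here is an assumption made for illustration. Samples are assumed to be coded as -1 (loss), 0 (no change), and 1 (gain) per interval; `shared_aberration_kernel`, `separation`, and `greedy_select` are hypothetical names; and a simple within-class versus between-class similarity score stands in for the SVM-based benefit measure.

```python
from itertools import combinations


def shared_aberration_kernel(x, y, features):
    """Assumed similarity: count selected intervals where both samples
    carry the same aberration (both gain or both loss)."""
    return sum(1 for i in features if x[i] != 0 and x[i] == y[i])


def separation(samples, labels, features):
    """Stand-in benefit score: mean within-class similarity minus
    mean between-class similarity over all sample pairs."""
    within, between = [], []
    for a, b in combinations(range(len(samples)), 2):
        k = shared_aberration_kernel(samples[a], samples[b], features)
        (within if labels[a] == labels[b] else between).append(k)

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(within) - mean(between)


def greedy_select(samples, labels, n_features):
    """Greedy forward selection: at each step add the single feature
    that gives the highest score together with those already chosen."""
    selected = []
    remaining = list(range(len(samples[0])))
    while remaining and len(selected) < n_features:
        best = max(remaining,
                   key=lambda f: separation(samples, labels, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up): four samples, five intervals, two classes.
samples = [
    [1, 1, 0, 0, -1],
    [1, 1, 0, -1, -1],
    [0, 0, -1, -1, -1],
    [0, 1, -1, -1, -1],
]
labels = ["A", "A", "B", "B"]
selected = greedy_select(samples, labels, 2)
print(selected)  # [0, 2]
```

On this toy data the sketch picks intervals 0 and 2, the two intervals whose aberrations are shared only within a class. Because the assumed kernel is a sum over intervals and the score is linear in it, the greedy loop here reduces to ranking intervals individually; a wrapper that retrains a classifier at each step would not simplify this way.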
- Subjects :
- Statistics and Probability
Gene Dosage
information science
Feature selection
Biology
Biochemistry
Pattern Recognition, Automated
Software
Artificial Intelligence
ISMB 2008 Conference Proceedings 19–23 July 2008, Toronto
Molecular Biology
Oligonucleotide Array Sequence Analysis
Biological data
Chromosome Mapping
Pattern recognition
Sequence Analysis, DNA
Filter (signal processing)
Comparative Genomics
Original Papers
Computer Science Applications
Support vector machine
Computational Mathematics
ComputingMethodologies_PATTERNRECOGNITION
Computational Theory and Mathematics
Kernel (statistics)
Pattern recognition (psychology)
Artificial intelligence
Data mining
Sequence Alignment
Algorithms
Comparative genomic hybridization
Details
- ISSN :
- 1367-4811 and 1367-4803
- Volume :
- 24
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....10dbd38d2e7acf6b9be1c79602c499ec