Optimal combination of feature selection and classification via local hyperplane based learning strategy
- Source :
- BMC Bioinformatics
- Publication Year :
- 2015
- Publisher :
- Springer Science and Business Media LLC, 2015.
Abstract
- Background: Classifying cancers by gene selection is among the most important and challenging procedures in biomedicine. A major challenge is to design an effective method that eliminates irrelevant, redundant, or noisy genes from the classification while retaining all of the highly discriminative genes. Results: We propose a gene selection method called local hyperplane-based discriminant analysis (LHDA). LHDA adopts two central ideas. First, it uses a local approximation rather than a global measurement; second, it embeds a recently reported classification model, the K-Local Hyperplane Distance Nearest Neighbor (HKNN) classifier, into its discriminator. Through classification accuracy-based iterations, LHDA obtains the feature weight vector and finally extracts the optimal feature subset. The performance of the proposed method is evaluated in extensive experiments on synthetic and real microarray benchmark datasets. Eight classical feature selection methods, four classification models, namely k-nearest neighbor (KNN), hyperplane k-nearest neighbor (HKNN), Support Vector Machine (SVM) and Random Forest, and two popular embedded learning schemes are employed for comparisons. Conclusion: The proposed method yielded performance comparable or superior to that of seven state-of-the-art models. This performance demonstrates the advantage of combining feature weighting with model learning in a unified framework to achieve the two tasks simultaneously. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0629-6) contains supplementary material, which is available to authorized users.
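The HKNN classifier named in the abstract can be illustrated with a small sketch. This is not the authors' code. It assumes the commonly described form of HKNN: for each class, take the K training points nearest to the query, fit a ridge-penalised local hyperplane through them, and assign the class whose hyperplane is closest. The feature weight vector is taken as a given input, because the abstract does not describe how LHDA updates it. All function and parameter names (`solve_linear`, `hyperplane_distance`, `hknn_predict`, `weights`, `k`, `lam`) are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def hyperplane_distance(x, neighbors, weights, lam):
    """Penalised squared distance from x to the local hyperplane through
    `neighbors`, measured in the weighted feature space (requires lam > 0)."""
    scale = [math.sqrt(w) for w in weights]
    xs = [s * v for s, v in zip(scale, x)]
    pts = [[s * v for s, v in zip(scale, p)] for p in neighbors]
    centroid = [sum(col) / len(pts) for col in zip(*pts)]
    dirs = [[a - c for a, c in zip(p, centroid)] for p in pts]
    resid = [a - c for a, c in zip(xs, centroid)]
    # Ridge-regularised normal equations: (V^T V + lam * I) alpha = V^T resid
    gram = [[sum(u * v for u, v in zip(di, dj)) + (lam if i == j else 0.0)
             for j, dj in enumerate(dirs)] for i, di in enumerate(dirs)]
    rhs = [sum(u * v for u, v in zip(di, resid)) for di in dirs]
    alpha = solve_linear(gram, rhs)
    proj = [sum(a * d[f] for a, d in zip(alpha, dirs)) for f in range(len(xs))]
    fit = sum((r - p) ** 2 for r, p in zip(resid, proj))
    return fit + lam * sum(a * a for a in alpha)


def hknn_predict(x, train_x, train_y, weights, k=2, lam=1.0):
    """Assign x to the class whose local hyperplane is nearest."""
    best_label, best_dist = None, math.inf
    for label in sorted(set(train_y)):
        members = [p for p, y in zip(train_x, train_y) if y == label]
        # K nearest class members under the weighted squared Euclidean distance
        members.sort(key=lambda p: sum(w * (a - b) ** 2
                                       for w, a, b in zip(weights, x, p)))
        d = hyperplane_distance(x, members[:k], weights, lam)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label
```

In this sketch, a weight of zero removes a feature from both the neighbor search and the hyperplane fit, which is one simple way a learned weight vector could act as feature selection. For example, with `train_x = [[0.0, 0.0], [0.2, 9.0], [4.0, 1.0], [4.2, 8.0]]`, `train_y = ["a", "a", "b", "b"]` and `weights = [1.0, 0.0]`, only the first feature influences the prediction.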
- Subjects :
- Support Vector Machine
Computer science
Feature weighting
Local hyperplane
Feature selection
computer.software_genre
Biochemistry
k-nearest neighbors algorithm
Machine Learning
Discriminative model
Structural Biology
Neoplasms
Feature (machine learning)
Cluster Analysis
Humans
Gene Regulatory Networks
Molecular Biology
business.industry
Gene Expression Profiling
Methodology Article
Applied Mathematics
Discriminant Analysis
Pattern recognition
Classification
Linear discriminant analysis
Local learning
Computer Science Applications
Random forest
Support vector machine
ComputingMethodologies_PATTERNRECOGNITION
Hyperplane
Data mining
Artificial intelligence
business
computer
HKNN
Details
- ISSN :
- 1471-2105
- Volume :
- 16
- Database :
- OpenAIRE
- Journal :
- BMC Bioinformatics
- Accession number :
- edsair.doi.dedup.....d4998beec9369ef6684c2d1e1b282a14
- Full Text :
- https://doi.org/10.1186/s12859-015-0629-6