1. A classification model for the Leiden proteomics competition.
- Author
- Hoefsloot HC, Smit S, and Smilde AK
- Subjects
- Biomarkers, Tumor/blood; Blood Proteins/chemistry; Breast Neoplasms/blood; Breast Neoplasms/classification; Breast Neoplasms/diagnosis; Case-Control Studies; Databases, Protein; Diagnosis, Computer-Assisted; Discriminant Analysis; Humans; Netherlands; Principal Component Analysis; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/statistics & numerical data; Models, Statistical; Proteomics/statistics & numerical data
- Abstract
A strategy is presented to build a discrimination model in proteomics studies. The model is built using cross-validation. This cross-validation step is easily combined with a variable selection method called rank products. The strategy is especially suitable when the ratio of samples to variables is low (undersampling), as is often the case in proteomics and metabolomics studies. Principal Component Discriminant Analysis is used as the classification method; however, the methodology can be used with any classifier. A data set containing serum samples from breast cancer patients and healthy controls is analysed. Double cross-validation shows that the sensitivity of the model is 82% and the specificity 86%. Putative biomarkers are identified using the variable selection method. In each cross-validation loop a classification model is built. The final classification is obtained by majority voting over this ensemble of classifiers.
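The strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a nearest-centroid rule stands in for Principal Component Discriminant Analysis (the abstract notes that any classifier can be used), variables are scored per fold by the absolute difference of class means (an assumed stand-in score), and the data are synthetic. All function names and parameters (`fit_ensemble`, `n_folds`, `n_keep`, and so on) are hypothetical. The outer loop of the double cross-validation, which would estimate sensitivity and specificity on held-out samples, is omitted for brevity.

```python
import math
import random


def class_means(X, y, label):
    """Per-variable mean over the samples carrying the given class label."""
    rows = [x for x, lab in zip(X, y) if lab == label]
    return [sum(r[j] for r in rows) / len(rows) for j in range(len(X[0]))]


def rank_variables(X, y):
    """Rank 1 = largest absolute difference of class means (assumed score)."""
    m0, m1 = class_means(X, y, 0), class_means(X, y, 1)
    scores = [abs(a - b) for a, b in zip(m0, m1)]
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0] * len(scores)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


def fit_fold_model(X, y, selected):
    """Nearest-centroid model restricted to the selected variables."""
    m0, m1 = class_means(X, y, 0), class_means(X, y, 1)
    return {"vars": selected,
            "c0": [m0[j] for j in selected],
            "c1": [m1[j] for j in selected]}


def predict_one(model, x):
    """Assign the class whose centroid is closer in squared distance."""
    d0 = sum((x[j] - c) ** 2 for j, c in zip(model["vars"], model["c0"]))
    d1 = sum((x[j] - c) ** 2 for j, c in zip(model["vars"], model["c1"]))
    return 1 if d1 < d0 else 0


def fit_ensemble(X, y, n_folds=5, n_keep=2):
    """Build one model per cross-validation loop and the rank product."""
    n, n_vars = len(X), len(X[0])
    folds = [list(range(i, n, n_folds)) for i in range(n_folds)]
    fold_ranks, models = [], []
    for held_out in folds:
        train = [i for i in range(n) if i not in held_out]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        ranks = rank_variables(X_tr, y_tr)
        fold_ranks.append(ranks)
        selected = sorted(range(n_vars), key=lambda j: ranks[j])[:n_keep]
        models.append(fit_fold_model(X_tr, y_tr, selected))
    # Rank product: geometric mean of each variable's rank across the loops.
    rank_product = [
        math.exp(sum(math.log(r[j]) for r in fold_ranks) / n_folds)
        for j in range(n_vars)
    ]
    return models, rank_product


def predict_majority(models, x):
    """Final class is the one receiving more than half of the votes."""
    votes = sum(predict_one(m, x) for m in models)
    return 1 if 2 * votes > len(models) else 0


# Synthetic data: 20 samples, 6 variables; variables 0 and 3 are informative.
random.seed(0)
y = [i % 2 for i in range(20)]
X = [[random.gauss(5.0 * lab if j in (0, 3) else 0.0, 1.0) for j in range(6)]
     for lab in y]

models, rank_product = fit_ensemble(X, y)
print("rank products:", [round(rp, 2) for rp in rank_product])
print("prediction:", predict_majority(models, [5, 0, 0, 5, 0, 0]))
```

Variables with the smallest rank product are ranked consistently high across the cross-validation loops, which is what makes them candidates for putative biomarkers in this kind of scheme.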
- Published
- 2008