1. Multiple approaches to data-mining of proteomic data based on statistical and pattern classification methods
- Author
Craig A. Struble, Xin Feng, Chin-Fu Chen, Nancy Sobczak, Hao Jiang, Roumyana Kirova, Peter J. Tonellato, Jacob W. Tatay, and Nan Jiang Wang
- Subjects
Proteomics; Computer science; Machine learning; Biochemistry; Fuzzy logic; Mass Spectrometry; Protein expression; Artificial Intelligence; Humans; Disease; Databases, Protein; Molecular Biology; Artificial neural network; Computational Biology; Proteins; Support vector machine; ComputingMethodologies_PATTERNRECOGNITION; Kernel method; Principal component analysis; Classification methods; Neural Networks, Computer; Data mining
- Abstract
The data-mining challenge presented is composed of two fundamental problems. Problem one is the separation of forty-one subjects into two classes based on the data produced by mass spectrometry of protein samples from each subject. Problem two is to find the specific differences between the protein expression data of two sets of subjects. In each problem, one group of subjects has a disease, while the other group is nondiseased. Each problem was approached with the intent to introduce a new and potentially useful tool for analyzing protein expression from mass spectrometry data. A variety of methodologies, both conventional and nonconventional, were used in the analysis of these problems. The results presented show both overlap and discrepancies. What is important is the breadth of the techniques and the future direction this analysis will create.
- Published
2003