Joint network and node selection for pathway-based genomic data analysis
- Source :
- Bioinformatics
- Publication Year :
- 2013
- Publisher :
- Oxford University Press, 2013.
Abstract
- Motivation: By capturing various biochemical interactions, biological pathways provide insight into underlying biological processes. Given high-dimensional microarray or RNA-sequencing data, a critical challenge is how to integrate them with rich information from pathway databases to jointly select relevant pathways and genes for phenotype prediction or disease prognosis. Addressing this challenge can help us deepen biological understanding of phenotypes and diseases from a systems perspective.
Results: In this article, we propose a novel sparse Bayesian model for joint network and node selection. This model integrates information from networks (e.g. pathways) and nodes (e.g. genes) by a hybrid of conditional and generative components. For the conditional component, we propose a sparse prior based on graph Laplacian matrices, each of which encodes detailed correlation structures between network nodes. For the generative component, we use a spike and slab prior over network nodes. The integration of these two components, coupled with efficient variational inference, enables the selection of networks as well as correlated network nodes in the selected networks. Simulation results demonstrate improved predictive performance and selection accuracy of our method over alternative methods. Based on three expression datasets for cancer study and the KEGG pathway database, we selected relevant genes and pathways, many of which are supported by biological literature. In addition to pathway analysis, our method is expected to have a wide range of applications in selecting relevant groups of correlated high-dimensional biomarkers.
Availability: The code can be downloaded at www.cs.purdue.edu/homes/szhe/software.html.
Contact: alanqi@purdue.edu
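The abstract refers to graph Laplacian matrices that encode correlation structure between the nodes of a network. As a rough illustration only, and not the authors' released code, the sketch below builds a Laplacian from a toy pathway adjacency matrix and evaluates the quadratic form w^T L w, which is small when connected genes carry similar weights. The function names `graph_laplacian` and `smoothness_penalty`, the three-gene chain, and the choice of NumPy are all assumptions made for this example; the paper's actual prior and its variational inference are not reproduced here.

```python
import numpy as np


def graph_laplacian(adjacency, normalized=True):
    """Return the (optionally symmetric-normalized) Laplacian of an undirected graph."""
    a = np.asarray(adjacency, dtype=float)
    degrees = a.sum(axis=1)
    lap = np.diag(degrees) - a  # combinatorial Laplacian L = D - A
    if not normalized:
        return lap
    # Symmetric normalization D^{-1/2} L D^{-1/2}; isolated nodes keep a zero row.
    inv_sqrt = np.zeros_like(degrees)
    nonzero = degrees > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degrees[nonzero])
    return inv_sqrt[:, None] * lap * inv_sqrt[None, :]


def smoothness_penalty(weights, laplacian):
    """Quadratic form w^T L w; for L = D - A this sums (w_i - w_j)^2 over edges."""
    w = np.asarray(weights, dtype=float)
    return float(w @ laplacian @ w)


# Hypothetical toy pathway: three genes connected in a chain 0 - 1 - 2.
toy_adjacency = np.array([[0, 1, 0],
                          [1, 0, 1],
                          [0, 1, 0]])
toy_laplacian = graph_laplacian(toy_adjacency, normalized=False)
```

With the unnormalized Laplacian, a weight vector that is constant across the chain incurs no penalty, while weights that differ across edges are penalized in proportion to the squared differences. This is the general sense in which a Laplacian-based prior can favor correlated nodes within a network.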
- Subjects :
- Statistics and Probability
Databases, Factual
Gene regulatory network
Inference
Gene Expression
Biology
Bayesian inference
computer.software_genre
Machine learning
Biochemistry
03 medical and health sciences
Bayes' theorem
0302 clinical medicine
Component (UML)
Humans
Gene Regulatory Networks
Molecular Biology
Selection (genetic algorithm)
030304 developmental biology
0303 health sciences
business.industry
Node (networking)
Gene Expression Profiling
Bayes Theorem
Genomics
Original Papers
Computer Science Applications
Pancreatic Neoplasms
Computational Mathematics
ComputingMethodologies_PATTERNRECOGNITION
Computational Theory and Mathematics
030220 oncology & carcinogenesis
Data mining
Artificial intelligence
Lymphoma, Large B-Cell, Diffuse
Laplacian matrix
business
Colorectal Neoplasms
computer
Algorithms
Carcinoma, Pancreatic Ductal
Details
- Language :
- English
- ISSN :
- 1367-4811 and 1367-4803
- Volume :
- 29
- Issue :
- 16
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....73e2871bbb20c49e23fdcf6696a3a182