Efficient inference for genetic association studies with multiple outcomes
- Publisher :
- Oxford University Press
Abstract
- Combined inference for heterogeneous high-dimensional data is critical in modern biology, where clinical and various kinds of molecular data may be available from a single study. Classical genetic association studies regress a single clinical outcome on many genetic variants one by one, but there is an increasing demand for joint analysis of many molecular outcomes and genetic variants in order to unravel functional interactions. Unfortunately, most existing approaches to joint modeling are either too simplistic to be powerful or are impracticable for computational reasons. Inspired by Richardson and others (2010, Bayesian Statistics 9), we consider a sparse multivariate regression model that allows simultaneous selection of predictors and associated responses. As Markov chain Monte Carlo (MCMC) inference on such models can be prohibitively slow when the number of genetic variants exceeds a few thousand, we propose a variational inference approach which produces posterior information very close to that of MCMC inference, at a much reduced computational cost. Extensive numerical experiments show that our approach outperforms popular variable selection methods and tailored Bayesian procedures, dealing within hours with problems involving hundreds of thousands of genetic variants and tens to hundreds of clinical or molecular outcomes.
- Subjects :
- 0301 basic medicine
Statistics and Probability
Clustering high-dimensional data
Multivariate statistics
Variable selection
Sparse multivariate regression
Bayesian probability
Inference
Feature selection
Machine learning
Statistics - Applications
03 medical and health sciences
Molecular quantitative trait locus analysis
Humans
Genetic Association Studies
Selection (genetic algorithm)
Statistical genetics
Models, Statistical
Genetic Variation
Markov chain Monte Carlo
General Medicine
Markov Chains
Bayesian statistics
High-dimensional data
030104 developmental biology
Artificial intelligence
Statistics, Probability and Uncertainty
Variational inference
Monte Carlo Method
Details
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....6a57c4cd927ba4b983257b9e6a8c2f18