Addressing confounding artifacts in reconstruction of gene co-expression networks
- Source :
- Genome Biology, Genome Biology, Vol 20, Iss 1, Pp 1-6 (2019)
- Publication Year :
- 2017
- Publisher :
- Cold Spring Harbor Laboratory, 2017.
Abstract
- Background: Gene co-expression networks capture diverse biological relationships between genes and are important tools for predicting gene function and understanding disease mechanisms. Functional interactions between genes have not been fully characterized for most organisms, so reconstruction of gene co-expression networks is of common interest in a variety of settings. However, methods routinely used to reconstruct these networks do not account for confounding artifacts known to affect high-dimensional gene expression measurements.
- Results: In this study, we show that artifacts such as batch effects in gene expression data confound commonly used network reconstruction algorithms. Both theoretically and empirically, we demonstrate that removing the effects of the top principal components from gene expression measurements prior to network inference can reduce false discoveries, especially when well-annotated technical covariates are not available. Using expression data from the GTEx project across multiple tissues and hundreds of individuals, we show that this latent factor residualization approach often reduces false discoveries in the reconstructed networks.
- Conclusion: Network reconstruction is susceptible to confounders that affect measurements of gene expression. Even controlling for major individual known technical covariates fails to fully eliminate confounding variation from the data. In studies where a wide range of annotated technical factors are measured and available, correcting gene expression data with multiple covariates can also improve network reconstruction, but such extensive annotations are not always available. Our study shows that principal component correction, which does not depend on study design or on annotation of all relevant confounders, removes patterns of artifactual variation and improves network reconstruction in both simulated data and gene expression data from the GTEx project. We have implemented our PC correction approach in the Bioconductor package sva, which can be used prior to network reconstruction with a range of methods.
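The principal component correction described in the abstract can be illustrated with a minimal sketch. This is not the sva implementation: the function name `remove_top_pcs`, the simulated batch effect, and the choice of one component are all assumptions made for illustration. The idea is to subtract the top principal components from a samples-by-genes expression matrix and pass the residuals to a network reconstruction method.

```python
import numpy as np


def remove_top_pcs(expr, n_pcs):
    """Return residuals of a samples x genes matrix after removing its top PCs.

    Hypothetical helper for illustration; not the sva package's API.
    """
    # Center each gene so the SVD corresponds to PCA on the covariance structure.
    centered = expr - expr.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Rank-n_pcs reconstruction: the variation explained by the top components.
    top = (u[:, :n_pcs] * s[:n_pcs]) @ vt[:n_pcs, :]
    return centered - top


def mean_abs_offdiag_corr(x):
    """Average absolute gene-gene correlation, ignoring the diagonal."""
    c = np.corrcoef(x, rowvar=False)
    mask = ~np.eye(c.shape[0], dtype=bool)
    return np.abs(c[mask]).mean()


# Simulated data (assumption): independent genes plus one shared batch factor,
# so any strong gene-gene correlation is purely artifactual.
rng = np.random.default_rng(0)
n_samples, n_genes = 200, 50
batch = rng.normal(size=(n_samples, 1))
loadings = rng.normal(size=(1, n_genes)) * 3.0
noise = rng.normal(size=(n_samples, n_genes))
expr = batch @ loadings + noise

before = mean_abs_offdiag_corr(expr)
residual = remove_top_pcs(expr, n_pcs=1)
after = mean_abs_offdiag_corr(residual)
print(f"mean |corr| before: {before:.3f}, after: {after:.3f}")
```

In this toy setting the spurious correlations induced by the shared factor largely disappear after residualization. On real data the number of components to remove has to be chosen with care, since removing too many may also strip out genuine co-expression signal.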
- Subjects :
- lcsh:QH426-470
Gene regulatory network
Short Report
Inference
Genomics
Computational biology
Biology
Machine learning
computer.software_genre
03 medical and health sciences
0302 clinical medicine
Humans
Gene Regulatory Networks
lcsh:QH301-705.5
Gene
030304 developmental biology
0303 health sciences
Mechanism (biology)
business.industry
Disease mechanisms
Confounding
Human genetics
Expression (mathematics)
lcsh:Genetics
lcsh:Biology (General)
Genetic Techniques
Principal component analysis
Artificial intelligence
Artifacts
Precision and recall
business
computer
030217 neurology & neurosurgery
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- Genome Biology, Genome Biology, Vol 20, Iss 1, Pp 1-6 (2019)
- Accession number :
- edsair.doi.dedup.....7f138e3db25f5ba201f22cc936aab33f
- Full Text :
- https://doi.org/10.1101/202903