Bayesian Structural Equation Modeling in Multiple Omics Data Integration with Application to Circadian Genes

Authors :
Maity, Arnab Kumar
Lee, Sang Chan
Mallick, Bani K.
Sarkar, Tapasree Roy
Source :
Bioinformatics, 36(13), 3951-3958 (2020)
Publication Year :
2021

Abstract

It is well known that integrating different data sources is worthwhile because it can unveil functions of genomic expression that remain hidden in a single-source analysis. Several studies have also shown that analyses of multi-platform data are more powerful. To this end, we consider the omics profiles of circadian genes, namely copy number changes and RNA sequencing data, together with the subjects' survival response. We develop a Bayesian structural equation model that couples linear regressions with a log-normal accelerated failure time regression to integrate information across these two platforms and predict the survival of the subjects. We place conjugate priors on the regression parameters and derive a Gibbs sampler from their conditional distributions. Our extensive simulation study shows that the integrative model fits the data better than its closest competitor. Analyses of glioblastoma and breast cancer data from TCGA, the largest genomics and transcriptomics database, support our findings. The method is implemented in the R package semmcmc, available on CRAN.
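
The following is a minimal Python sketch of the kind of sampler the abstract describes: conjugate priors on regression parameters, with Gibbs updates drawn from the conditional distributions. It is not the authors' model and not the semmcmc interface. Everything in it is an assumption made for illustration: the function names `gibbs_linreg` and `simulate`, the prior settings (`tau2`, `a0`, `b0`), the simulated data, and the simplification that no survival times are censored. Under that simplification the log-normal accelerated failure time regression reduces to a linear regression on log survival time, so the two stages (expression on copy number, log survival time on expression) can be sampled with the same routine.

```python
import numpy as np

def gibbs_linreg(X, y, n_iter=2000, burn_in=500, tau2=100.0, a0=2.0, b0=1.0, seed=0):
    """Gibbs sampler for y = X beta + e, e ~ N(0, sigma2).

    Assumed priors (illustrative only):
      beta ~ N(0, tau2 * I), sigma2 ~ Inverse-Gamma(a0, b0).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    sigma2 = 1.0
    betas = np.empty((n_iter, p))
    sigma2s = np.empty(n_iter)
    for it in range(n_iter):
        # beta | sigma2, y is Gaussian with covariance V and mean V X'y / sigma2
        V = np.linalg.inv(XtX / sigma2 + np.eye(p) / tau2)
        mean = V @ Xty / sigma2
        beta = mean + np.linalg.cholesky(V) @ rng.standard_normal(p)
        # sigma2 | beta, y is Inverse-Gamma(a0 + n/2, b0 + RSS/2);
        # drawn as the reciprocal of a Gamma variate with scale = 1 / rate
        resid = y - X @ beta
        sigma2 = 1.0 / rng.gamma(a0 + n / 2.0, 1.0 / (b0 + 0.5 * resid @ resid))
        betas[it], sigma2s[it] = beta, sigma2
    return betas[burn_in:], sigma2s[burn_in:]

def simulate(n=300, seed=1):
    """Hypothetical data: copy number drives expression, expression drives log survival time."""
    rng = np.random.default_rng(seed)
    copy_number = rng.normal(size=n)
    expression = 0.8 * copy_number + rng.normal(scale=0.5, size=n)
    log_time = 1.0 - 0.6 * expression + rng.normal(scale=0.4, size=n)
    return copy_number, expression, log_time

copy_number, expression, log_time = simulate()
ones = np.ones_like(copy_number)

# Stage 1: expression regressed on copy number
b_expr, s_expr = gibbs_linreg(np.column_stack([ones, copy_number]), expression)
# Stage 2: log survival time regressed on expression (uncensored log-normal AFT)
b_surv, s_surv = gibbs_linreg(np.column_stack([ones, expression]), log_time, seed=2)

expr_mean = b_expr.mean(axis=0)
surv_mean = b_surv.mean(axis=0)
print("posterior mean, expression ~ copy number:", expr_mean)
print("posterior mean, log time ~ expression:  ", surv_mean)
```

With real survival data, censored times would need an extra step, for example imputing the censored log times from a truncated normal inside each Gibbs iteration; that step is omitted here.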

Details

Database :
arXiv
Journal :
Bioinformatics, 36(13), 3951-3958 (2020)
Publication Type :
Report
Accession number :
edsarx.2112.03330
Document Type :
Working Paper
Full Text :
https://doi.org/10.1093/bioinformatics/btaa286