1. Model guided trait-specific co-expression network estimation as a new perspective for identifying molecular interactions and pathways
- Author
- Juho A. J. Kontio, Tanja Pyhäjärvi, Mikko J. Sillanpää, Viikki Plant Science Centre (ViPS), Department of Forest Sciences, Forest Genomics, Forest Ecology and Management, and Department of Mathematics and Statistics
- Subjects
SELECTION; PROTEIN EXPRESSION; Computer science; Gene Identification and Analysis; Normal Distribution; Genetic Networks; LASSO; Hematologic Cancers and Related Disorders; ACTIVATION; 0302 clinical medicine; Mathematical and Statistical Techniques; Medicine and Health Sciences; TRANSCRIPTION; Biology (General); Parametric statistics; 0303 health sciences; Leukemia; Covariance; Systems Biology; Statistics; 1184 Genetics, developmental biology, physiology; Hematology; ASSOCIATION; Myeloid Leukemia; CANCER; Variety (cybernetics); Identification (information); Phenotypes; Leukemia, Myeloid, Acute; Oncology; Research Design; 030220 oncology & carcinogenesis; Parametric model; Physical Sciences; Network Analysis; Algorithms; Research Article; Acute Myeloid Leukemia; Computer and Information Sciences; Process (engineering); Clinical Research Design; QH301-705.5; Systems biology; Machine learning; Research and Analysis Methods; Proof of Concept Study; SIGNALING PATHWAYS; 03 medical and health sciences; Genetics; Humans; GENE-GENE INTERACTIONS; EPISTASIS; Statistical Methods; 030304 developmental biology; Biology and Life Sciences; Cancers and Neoplasms; Random Variables; Probability Theory; Probability Distribution; Survival Analysis; Gene Expression Regulation; Artificial intelligence; Biological network; Mathematics
- Abstract
A wide variety of 1) parametric regression models and 2) co-expression networks have been developed for finding gene-by-gene interactions underlying complex traits from expression data. While both methodological schemes have their own well-known benefits, little is known about their synergistic potential. Our study introduces a methodological fusion of the two that cross-exploits the strengths of the individual approaches via a built-in information-sharing mechanism. This fusion is theoretically based on certain trait-conditioned dependency patterns between two genes, which depend on the genes' roles in the underlying parametric model. The resulting trait-specific co-expression network estimation method 1) serves to enhance the interpretation of biological networks in a parametric sense, and 2) exploits the underlying parametric model itself in the estimation process. To also account for the substantial intrinsic noise and collinearities often present in expression data, a tailored co-expression measure is introduced along with this framework to alleviate the related computational problems. A remarkable advance over the reference methods in simulated scenarios substantiates the method’s high efficiency. As a proof of concept, this synergistic approach is successfully applied in survival analysis with acute myeloid leukemia data, further highlighting the framework’s versatility and broad practical relevance.

Author summary
Here we built a mathematically justified bridge between 1) parametric approaches and 2) co-expression networks for identifying molecular interactions underlying complex traits. We first raised the concern that methodological improvements to these schemes that adjust only their power and scalability are bounded by more fundamental, scheme-specific limitations. Subsequently, our theoretical results were exploited to overcome these limitations and to find gene-by-gene interactions that neither scheme can capture alone.
We also aimed to illustrate how this framework enables co-expression networks to be interpreted in a more parametric sense, so that systematic insights into complex biological processes can be obtained more reliably. The main procedure was designed to suit various types of biological applications and high-dimensional data, in order to cover the area of systems biology as broadly as possible. In particular, we chose to illustrate the method’s applicability to gene-profile-based risk stratification in cancer research using public acute myeloid leukemia datasets.
- Published
- 2021