1. Multi-omic modelling of inflammatory bowel disease with regularized canonical correlation analysis
- Author
Anna Carrasco, Aida Mayorgas, Dirk Haller, Maria Esteve, Eva Tristán, Ana Maria Corraliza, Elena Ricart, A. Salas, Amira Metwaly, Julián Panés, Lluís Revilla, Juan José Lozano, and Maria Carme Masamunt
- Subjects
0301 basic medicine, Male, Multivariate analysis, Computer science, Cell Transplantation, Biopsy, Datasets as Topic, Crohn's Disease, Pathogenesis, Pathology and Laboratory Medicine, Inflammatory bowel disease, 0302 clinical medicine, Medicine and Health Sciences, Blood and Lymphatic System Procedures, Gliomas, Neurological Tumors, Hyperparameter, Multidisciplinary, Genetic transcription, Hematopoietic Stem Cell Transplantation, Genomics, Glioma, Explained variation, Oncology, Neurology, Medical Microbiology, Generalized canonical correlation, Host-Pathogen Interactions, Medicine, 030211 gastroenterology & hepatology, Female, Canonical correlation, Transcriptome Analysis, Research Article, Adult, Adolescent, Science, Immunology, Gut microbiota, Surgical and Invasive Medical Procedures, Computational biology, Microbial Genomics, Gastroenterology and Hepatology, Biology, Microbiology, Autoimmune Diseases, 03 medical and health sciences, Young Adult, medicine, Genetics, Humans, Microbiome, Gastrointestinal microbiome, Transplantation, business.industry, Model selection, Inflammatory Bowel Disease, Biology and Life Sciences, Computational Biology, Cancers and Neoplasms, medicine.disease, Inflammatory Bowel Diseases, Genome Analysis, Hematopoiesis, Gastrointestinal Microbiome, Data set, 030104 developmental biology, Multivariate Analysis, Clinical Immunology, Personalized medicine, Clinical Medicine, business, Transcriptome, Stem Cell Transplantation
- Abstract
Background: Personalized medicine requires finding relationships between the variables that influence a patient’s phenotype and predicting an outcome. Sparse generalized canonical correlation analysis identifies relationships between different groups of variables, but it requires a model of the expected interactions between those groups. Describing these interactions is challenging when the relationship is unknown or when there is no pre-established hypothesis. Our aim was therefore to develop a method to find the relationships between microbiome data, host transcriptome data, and the relevant clinical variables in a complex disease such as Crohn’s disease.
Results: We present a method to identify interactions based on canonical correlation analysis. Using a dataset of Crohn’s disease patients with longitudinal sampling, we show that the model is the most important factor in identifying relationships between blocks. The analysis was first tested on two previously published datasets, a glioma dataset and a Crohn’s disease and ulcerative colitis dataset, which we use to describe how to select the optimum parameters. Using these parameters, we analyzed our Crohn’s disease dataset. We selected the model with the highest inner average variance explained to identify relationships between the transcriptome, the gut microbiome, and clinically relevant variables. Adding the clinically relevant variables improved the average variance explained by the model compared to multiple co-inertia analysis.
Conclusions: The methodology described herein provides a general framework for identifying interactions between sets of omic data and clinically relevant variables. Following this method, we found genes and microorganisms that were related to each other independently of the model, while others were specific to the model used. Thus, model selection proved crucial to finding the existing relationships in multi-omics datasets.
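The selection criterion named in the abstract, inner average variance explained (inner AVE), can be illustrated with a minimal sketch. It assumes the definition commonly used for generalized canonical correlation analysis: the design-weighted mean of squared correlations between block components. Whether this matches the authors' exact computation is an assumption, and the function name `inner_ave`, the block labels, and the toy data are hypothetical.

```python
import numpy as np


def inner_ave(components, design):
    """Design-weighted mean of squared correlations between block components.

    components: (n_samples, n_blocks) array, one component column per block.
    design: symmetric (n_blocks, n_blocks) array of non-negative link weights
            (the "model" of which blocks are expected to interact).
    """
    components = np.asarray(components, dtype=float)
    design = np.asarray(design, dtype=float)
    corr = np.corrcoef(components, rowvar=False)
    upper = np.triu_indices(design.shape[0], k=1)  # count each block pair once
    weights = design[upper]
    if weights.sum() == 0:
        raise ValueError("design has no links between blocks")
    return float((weights * corr[upper] ** 2).sum() / weights.sum())


# Toy data (hypothetical): two blocks share a signal, the third is noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=50)
components = np.column_stack([
    shared + 0.1 * rng.normal(size=50),  # stand-in for a transcriptome component
    shared + 0.1 * rng.normal(size=50),  # stand-in for a microbiome component
    rng.normal(size=50),                 # stand-in for a clinical component
])

# One candidate model: every block linked to every other block.
fully_linked = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
score = inner_ave(components, fully_linked)
```

In a real analysis the components would be re-estimated for each candidate design before scoring, so this sketch shows only the metric itself; choosing a model then amounts to keeping the candidate with the largest value.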
- Published
- 2020