Back to Search Start Over

A Cross-Level Information Transmission Network for Hierarchical Omics Data Integration and Phenotype Prediction from a New Genotype

Authors :
Lei Xie
Di He
Source :
Bioinformatics
Publication Year :
2021

Abstract

Motivation An unsolved fundamental problem in biology is to predict phenotypes from a new genotype under environmental perturbations. The emergence of multiple omics data provides new opportunities but imposes great challenges in the predictive modeling of genotype-phenotype associations. Firstly, the high-dimensionality of genomics data and the lack of coherent labeled data often make the existing supervised learning techniques less successful. Secondly, it is challenging to integrate heterogeneous omics data from different resources. Finally, few works have explicitly modeled the information transmission from DNA to phenotype, which involves multiple intermediate molecular types. Higher-level features (e.g. gene expression) usually have stronger discriminative and interpretable power than lower-level features (e.g. somatic mutation). Results We propose a novel Cross-LEvel Information Transmission (CLEIT) network framework to address the above issues. CLEIT aims to represent the asymmetrical multi-level organization of the biological system by integrating multiple incoherent omics data and to improve the prediction power of low-level features. CLEIT first learns the latent representation of the high-level domain then uses it as ground-truth embedding to improve the representation learning of the low-level domain in the form of contrastive loss. Besides, CLEIT can leverage the unlabeled heterogeneous omics data to improve the generalizability of the predictive model. We demonstrate the effectiveness and significant performance boost of CLEIT in predicting anti-cancer drug sensitivity from somatic mutations via the assistance of gene expressions when compared with state-of-the-art methods. CLEIT provides a general framework to model information transmissions and integrate multi-modal data in a multi-level system. Availabilityand implementation The source code is freely available at https://github.com/XieResearchGroup/CLEIT. Supplementary information Supplementary data are available at Bioinformatics online.

Details

ISSN :
13674811
Database :
OpenAIRE
Journal :
Bioinformatics (Oxford, England)
Accession number :
edsair.doi.dedup.....84c9a28561f63af6cdb201652dd955f6