1. BraneMF: integration of biological networks for functional analysis of proteins
- Author
Surabhi Jagtap, Abdulkadir Çelikkanat, Aurélie Pirayre, Frédérique Bidard, Laurent Duval, Fragkiskos D Malliaros
Affiliations: Centre de vision numérique (CVN), Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Université Paris-Saclay; OPtimisation Imagerie et Santé (OPIS), Inria Saclay - Ile de France; Department of Applied Mathematics and Computer Science [Lyngby] (DTU Compute), Danmarks Tekniske Universitet = Technical University of Denmark (DTU); IFP Energies nouvelles (IFPEN)
Funding: ANR-20-CE23-0009, GraphIA, Apprentissage de représentation évolutive et robuste sur les graphes (2020)
- Subjects
Statistics and Probability, Proteins, Saccharomyces cerevisiae, Biochemistry, MESH: Cluster Analysis, MESH: Saccharomyces cerevisiae, Computer Science Applications, Computational Mathematics, Computational Theory and Mathematics, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], Cluster Analysis, MESH: Proteins, [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM], Molecular Biology
- Abstract
Motivation: The cellular system of a living organism is composed of interacting bio-molecules that control cellular processes at multiple levels. Their correspondences are represented by tightly regulated molecular networks. The increase of omics technologies has favored the generation of large-scale disparate data and the consequent demand for simultaneously using molecular and functional interaction networks: gene co-expression, protein–protein interaction (PPI), genetic interaction and metabolic networks. They are rich sources of information at different molecular levels, and their effective integration is essential to understand cell functioning and their building blocks (proteins). Therefore, it is necessary to obtain informative representations of proteins and their proximity, that are not fully captured by features extracted directly from a single informational level. We propose BraneMF, a novel random walk-based matrix factorization method for learning node representation in a multilayer network, with application to omics data integration.
Results: We test BraneMF with PPI networks of Saccharomyces cerevisiae, a well-studied yeast model organism. We demonstrate the applicability of the learned features for essential multi-omics inference tasks: clustering, function and PPI prediction. We compare it to the state-of-the-art integration methods for multilayer networks. BraneMF outperforms baseline methods by achieving high prediction scores for a variety of downstream tasks. The robustness of results is assessed by an extensive parameter sensitivity analysis.
Availability and implementation: BraneMF’s code is freely available at: https://github.com/Surabhivj/BraneMF, along with datasets, embeddings and result files.
Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2022