Gemini: memory-efficient integration of hundreds of gene networks with high-order pooling.

Authors :
Woicik, Addie
Zhang, Mingxin
Xu, Hanwen
Mostafavi, Sara
Wang, Sheng
Source :
Bioinformatics. 2023 Supplement, Vol. 39, p. i504-i512. 9p.
Publication Year :
2023

Abstract

Motivation: The exponential growth of genomic sequencing data has created ever-expanding repositories of gene networks. Unsupervised network integration methods are critical to learn informative representations for each gene, which are later used as features for downstream applications. However, these network integration methods must be scalable to account for the increasing number of networks and robust to an uneven distribution of network types within hundreds of gene networks.

Results: To address these needs, we present Gemini, a novel network integration method that uses memory-efficient high-order pooling to represent and weight each network according to its uniqueness. Gemini then mitigates the uneven network distribution through mixing up existing networks to create many new networks. We find that Gemini leads to more than a 10% improvement in F1 score, 15% improvement in micro-AUPRC, and 63% improvement in macro-AUPRC for human protein function prediction by integrating hundreds of networks from BioGRID, and that Gemini's performance significantly improves when more networks are added to the input network collection, while Mashup and BIONIC embeddings' performance deteriorates. Gemini thereby enables memory-efficient and informative network integration for large gene networks and can be used to massively integrate and analyze networks in other domains.

Availability and implementation: Gemini can be accessed at: https://github.com/MinxZ/Gemini. [ABSTRACT FROM AUTHOR]

Subjects

Subjects :
*GENE regulatory networks
*BIONICS

Details

Language :
English
ISSN :
13674803
Volume :
39
Database :
Academic Search Index
Journal :
Bioinformatics
Publication Type :
Academic Journal
Accession number :
164654398
Full Text :
https://doi.org/10.1093/bioinformatics/btad247