A Multimodal Protein Representation Framework for Quantifying Transferability Across Biochemical Downstream Tasks

Authors :
Fan Hu
Yishen Hu
Weihong Zhang
Huazhen Huang
Yi Pan
Peng Yin
Source :
Advanced Science, Vol 10, Iss 22, Pp n/a-n/a (2023)
Publication Year :
2023
Publisher :
Wiley, 2023.

Abstract

Proteins are the building blocks of life, carrying out fundamental functions in biology. In computational biology, an effective protein representation facilitates many important biological quantifications. Most existing protein representation methods are derived from self‐supervised language models designed for text analysis. Proteins, however, are more than linear sequences of amino acids. Here, a multimodal deep learning framework (MASSA) that incorporates ≈1 million protein sequences, structures, and functional annotations is proposed. A multitask learning process with five specific pretraining objectives is presented to extract a fine‐grained protein‐domain feature. After pretraining, the multimodal protein representation achieves state‐of‐the‐art performance in specific downstream tasks such as protein properties (stability and fluorescence), protein‒protein interactions (SHS27k/SHS148k/STRING/SKEMPI), and protein‒ligand interactions (kinase, DUD‐E), while achieving competitive results in secondary structure and remote homology tasks. Moreover, a novel optimal‐transport‐based metric with rich geometry awareness is introduced to quantify the dynamic transferability from the pretrained representation to the related downstream tasks, providing a panoramic view of the step‐by‐step learning process. The pairwise distances between these downstream tasks are also calculated, and a strong correlation between the inter‐task feature space distributions and adaptability is observed.
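
The abstract does not give the formulation of the optimal‐transport‐based metric. As a generic illustration only of how an optimal‐transport distance between two sets of feature vectors can be computed, the sketch below uses entropy‐regularized transport (Sinkhorn iterations) with uniform weights and a squared Euclidean cost. These choices, the function name `sinkhorn_distance`, and the parameters `reg` and `n_iters` are assumptions made for this example and are not the authors' metric.

```python
import numpy as np


def sinkhorn_distance(x, y, reg=0.1, n_iters=500):
    """Entropy-regularized OT cost between two point clouds.

    Hypothetical helper for illustration; NOT the metric from the paper.
    x: array of shape (n, d), y: array of shape (m, d).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Pairwise squared Euclidean cost matrix, shape (n, m).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)

    # Rescale costs to [0, 1] so exp(-cost / reg) does not underflow.
    scale = cost.max()
    if scale > 0:
        cost = cost / scale

    # Uniform weights over the points of each cloud (an assumption).
    a = np.full(len(x), 1.0 / len(x))
    b = np.full(len(y), 1.0 / len(y))

    # Sinkhorn iterations: alternately rescale to match both marginals.
    kernel = np.exp(-cost / reg)
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)

    # Transport plan and its total cost, mapped back to the original scale.
    plan = u[:, None] * kernel * v[None, :]
    return float((plan * cost).sum()) * scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features_a = rng.normal(size=(50, 8))   # stand-in for one feature set
    features_b = features_a + 5.0           # a strongly shifted feature set
    print(sinkhorn_distance(features_a, features_b))
```

In such a sketch, a larger value indicates that the two feature distributions lie further apart. Because of the entropy term, the value for two identical, non‐degenerate clouds is small but not exactly zero.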

Details

Language :
English
ISSN :
21983844
Volume :
10
Issue :
22
Database :
Directory of Open Access Journals
Journal :
Advanced Science
Publication Type :
Academic Journal
Accession number :
edsdoj.687d03d8b9834052aaa3fc3cac28ac6d
Document Type :
article
Full Text :
https://doi.org/10.1002/advs.202301223