Transforming the Language of Life: Transformer Neural Networks for Protein Prediction Tasks
- Authors
Sergei Maslov, Mark Hopkins, Simon Liu, Anna Ritz, Maeve Heflin, and Ananthan Nambiar
- Subjects
Artificial neural network, Protein family, Deep learning, Machine learning, Protein sequencing, Protein–protein interaction prediction, Transformer (machine learning model)
- Abstract
The scientific community is rapidly generating protein sequence information, but only a fraction of these proteins can be experimentally characterized. While promising deep learning approaches for protein prediction tasks have emerged, they have computational limitations or are designed to solve a specific task. We present a Transformer neural network that pre-trains task-agnostic sequence representations. This model is fine-tuned to solve two different protein prediction tasks: protein family classification and protein interaction prediction. Our method is comparable to existing state-of-the-art approaches for protein family classification while being much more general, and it outperforms all other approaches for protein interaction prediction. These results offer a promising framework for fine-tuning the pre-trained sequence representations for other protein prediction tasks.
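To make the fine-tuning setup concrete, the following is a minimal illustrative sketch (not the authors' released code) of a Transformer encoder over amino-acid tokens with a task-specific classification head, of the kind that could be fine-tuned for protein family classification. The vocabulary size, model dimensions, pooling strategy, and class names here are assumptions for illustration only.

```python
# Illustrative sketch only: fine-tuning a Transformer encoder for protein
# family classification. Hyperparameters and pooling choice are assumptions.
import torch
import torch.nn as nn


class ProteinTransformerClassifier(nn.Module):
    def __init__(self, vocab_size=26, d_model=128, nhead=4,
                 num_layers=2, num_families=100, max_len=512):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # Task-specific head; in a fine-tuning setup the encoder weights would
        # be initialized from the pre-trained sequence representation model.
        self.classifier = nn.Linear(d_model, num_families)

    def forward(self, tokens, padding_mask=None):
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.token_emb(tokens) + self.pos_emb(positions)
        x = self.encoder(x, src_key_padding_mask=padding_mask)
        # Mean-pool over residue positions to get one vector per sequence.
        pooled = x.mean(dim=1)
        return self.classifier(pooled)


# Toy usage: a batch of 8 random amino-acid token sequences of length 64.
model = ProteinTransformerClassifier()
tokens = torch.randint(0, 26, (8, 64))
logits = model(tokens)  # shape: (8, num_families)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 100, (8,)))
loss.backward()
```

A protein interaction prediction head would differ mainly in taking a pair of pooled sequence representations as input; the shared pre-trained encoder is what makes the representations task-agnostic.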
- Published
- 2020