How to Hallucinate Functional Proteins

Authors :
Costello, Zak
Martin, Hector Garcia
Publication Year :
2019

Abstract

Here we present a novel approach to protein design and phenotypic inference using a generative model for protein sequences. BioSeqVAE, a variational autoencoder variant, can hallucinate syntactically valid protein sequences that are likely to fold and function. BioSeqVAE is trained on the entire known protein sequence space and learns to generate valid examples of protein sequences in an unsupervised manner. The model is validated by showing that its latent feature space is useful and that it accurately reconstructs sequences. Its usefulness is demonstrated with a selection of relevant downstream design tasks. This work is intended to serve as a computational first step towards a general-purpose, structure-free protein design tool.

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.1903.00458
Document Type :
Working Paper