Back to Search Start Over

SADeepcry: a deep learning framework for protein crystallization propensity prediction using self-attention and auto-encoder networks.

Authors :
Wang, Shaokai
Zhao, Haochen
Source :
Briefings in Bioinformatics. Sep2022, Vol. 23 Issue 5, p1-9. 9p.
Publication Year :
2022

Abstract

The X-ray diffraction (XRD) technique based on crystallography is the main experimental method to analyze the three-dimensional structure of proteins. The production process of protein crystals on which the XRD technique relies has undergone multiple experimental steps, which requires a lot of manpower and material resources. In addition, studies have shown that not all proteins can form crystals under experimental conditions, and the success rate of the final crystallization of proteins is only <10%. Although some protein crystallization predictors have been developed, not many tools capable of predicting multi-stage protein crystallization propensity are available and the accuracy of these tools is not satisfactory. In this paper, we propose a novel deep learning framework, named SADeepcry, for predicting protein crystallization propensity. The framework can be used to estimate the three steps (protein material production, purification and crystallization) in protein crystallization experiments and the success rate of the final protein crystallization. SADeepcry uses the optimized self-attention and auto-encoder modules to extract sequence, structure and physicochemical features from the proteins. Compared with other state-of-the-art protein crystallization propensity prediction models, SADeepcry can obtain more complex global spatial long-distance dependence of protein sequence information. Our computational results show that SADeepcry has increased Matthews correlation coefficient and area under the curve, by 100.3% and 13.4%, respectively, over the DCFCrystal method on the benchmark dataset. The codes of SADeepcry are available at https://github.com/zhc940702/SADeepcry. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
14675463
Volume :
23
Issue :
5
Database :
Academic Search Index
Journal :
Briefings in Bioinformatics
Publication Type :
Academic Journal
Accession number :
159311854
Full Text :
https://doi.org/10.1093/bib/bbac352