Estimating Change in Foldability Due to Multipoint Deletions in Protein Structures

Authors :
Pralay Mitra
Kushal Kanti Ghosh
Amit Kumar
Anupam Banerjee
Source :
Journal of chemical information and modeling. 60(12)
Publication Year :
2020

Abstract

Insertions/deletions of amino acids in the protein backbone can alter structural/functional specifications. They can either contribute positively to the evolutionary process or result in disease conditions. Although they are the second most prevalent form of protein modification, there are no databases or computational frameworks that delineate harmful multipoint deletions (MPDs) from beneficial ones. We introduce a positive-unlabeled learning-based prediction framework (PROFOUND) that utilizes fold-level attributes, environment-specific properties, and deletion site-specific properties to predict the change in foldability arising from such MPDs, in both the non-loop and loop regions of protein structures. In the absence of any protein structure dataset to study MPDs, we introduce a dataset with 153 MPD instances that lead to native-like folded structures and 7650 unlabeled MPD instances whose effect on the foldability of the corresponding proteins is unknown. In 10-fold cross-validation on this newly introduced dataset, PROFOUND reports a recall of 82.2% (86.6%) and a fall-out rate (FR) of 14.2% (20.6%) for MPDs in the protein loop (non-loop) region. The low FR suggests that the foldability of proteins subject to MPDs is not random and necessitates unique specifications of the deleted region. In addition, we find that additional evolutionary attributes contribute to higher recall and lower FR. This first-of-its-kind system for predicting foldability changes due to MPDs, together with the newly introduced dataset, will potentially aid novel protein engineering endeavors.
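As a reading aid for the two metrics quoted in the abstract, the following is a minimal sketch of the textbook definitions of recall, TP / (TP + FN), and fall-out rate, FP / (FP + TN). It is not the PROFOUND implementation: the function name `recall_and_fallout` and the toy labels are hypothetical, and the abstract does not state how FR is computed on unlabeled instances, so treating label 0 as a known negative here is an assumption.

```python
def recall_and_fallout(y_true, y_pred):
    """Return (recall, fall-out) for binary labels, where 1 marks the positive class.

    Assumption: label 0 is a true negative. In a positive-unlabeled setting the
    0 class would instead be "unlabeled", and the paper may define FR differently.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # positives predicted positive
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # positives missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # negatives predicted positive
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # negatives predicted negative

    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    fallout = fp / (fp + tn) if (fp + tn) else float("nan")
    return recall, fallout


# Toy labels for illustration only; these are not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

recall, fallout = recall_and_fallout(y_true, y_pred)
print(f"recall = {recall:.3f}, fall-out = {fallout:.3f}")
```

With these toy labels, three of four positives are recovered and one of six negatives is flagged, so a low fall-out alongside a high recall indicates that positive predictions are selective rather than indiscriminate.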

Details

ISSN :
1549-960X
Volume :
60
Issue :
12
Database :
OpenAIRE
Journal :
Journal of chemical information and modeling
Accession number :
edsair.doi.dedup.....7fb508cd78a98172f31697462c0fdf5a