Characterization of intrinsically disordered regions in proteins informed by human genetic diversity.

Authors :
Ahmed, Shehab S.
Rifat, Zaara T.
Lohia, Ruchi
Campbell, Arthur J.
Dunker, A. Keith
Rahman, M. Sohel
Iqbal, Sumaiya
Source :
PLoS Computational Biology. 3/11/2022, Vol. 18 Issue 3, p1-28. 28p. 1 Diagram, 1 Chart, 5 Graphs.
Publication Year :
2022

Abstract

All proteomes contain both proteins and polypeptide segments that don't form a defined three-dimensional structure yet are biologically active; these are called intrinsically disordered proteins and regions (IDPs and IDRs). Most of these IDPs/IDRs lack useful functional annotation, limiting our understanding of their importance for organism fitness. Here we characterized IDRs using protein sequence annotations of functional sites and regions available in the UniProt knowledgebase ("UniProt features": active site, ligand-binding pocket, regions mediating protein-protein interactions, etc.). By measuring the statistical enrichment of twenty-five UniProt features in 981 IDRs of 561 human proteins, we identified eight features that are commonly located in IDRs. We then collected the genetic variant data from the general population and patient-based databases and evaluated the prevalence of population and pathogenic variations in IDPs/IDRs. We observed that some IDRs tolerate 2 to 12 times more single amino acid-substituting missense mutations than synonymous changes in the general population. However, we also found that 37% of all germline pathogenic mutations are located in disordered regions of 96 proteins. Based on the observed-to-expected frequency of mutations, we categorized 34 IDRs in 20 proteins (DDX3X, KIT, RB1, etc.) as intolerant to mutation. Finally, using statistical analysis and a machine learning approach, we demonstrate that mutation-intolerant IDRs carry a distinct signature of functional features. Our study presents a novel approach to assign functional importance to IDRs by leveraging the wealth of available genetic data, which will aid in a deeper understanding of the role of IDRs in biological processes and disease mechanisms.

Author summary: Intrinsically disordered regions (IDRs) in proteins are typically not considered to be functionally as important as the structured parts. However, it is becoming evident that both structured and disordered regions are essential for the repertoire of protein functions. Nevertheless, most of these largely flexible and functionally dynamic protein regions remain uncharacterized. Here, informed by human genetic diversity (i.e., genetic variations from the general population and patients), we identified the IDRs that are more frequently mutated in patients than in relatively healthy individuals, and further show that they carry a set of characteristic functional features. This approach provides a different and effective means to identify unannotated disordered protein segments that are biologically important and lead to pathogenesis upon mutation. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1553734X
Volume :
18
Issue :
3
Database :
Academic Search Index
Journal :
PLoS Computational Biology
Publication Type :
Academic Journal
Accession number :
155690770
Full Text :
https://doi.org/10.1371/journal.pcbi.1009911