Disease association and comparative genomics of compositional bias in human proteins [version 1; peer review: 1 approved, 1 approved with reservations]

Authors :
Christos E. Kouros
Vasiliki Makri
Christos A. Ouzounis
Anastasia Chasapi
Author Affiliations :
1 BCCB-AIIA, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
2 BCPL, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas (CERTH), Thessaloniki, Greece
Source :
F1000Research. 12:198
Publication Year :
2023
Publisher :
London, UK: F1000 Research Limited, 2023.

Abstract

Background: The evolutionary rate of disordered proteins varies greatly due to the lack of structural constraints. So far, few studies have investigated the presence/absence patterns of intrinsically disordered regions (IDRs) across phylogenies in conjunction with human disease. In this study, we report a genome-wide analysis of the association between compositional bias and disease in human proteins, together with the taxonomic distribution of these proteins.

Methods: The human genome protein set provided by the Ensembl database was annotated and analysed with respect to both disease associations and the detection of compositional bias. The UniProt Reference Proteome dataset, containing 11,297 proteomes, was used as the target dataset for the comparative genomics of a well-defined subset of the human genome, including 100 characteristic, compositionally biased proteins, some linked to disease.

Results: Cross-evaluation of compositional bias and disease association in the human genome reveals a significant bias towards low complexity regions in disease-associated genes, with charged, hydrophilic amino acids appearing as over-represented. The phylogenetic profiling of 17 disease-associated, low complexity proteins across 11,297 proteomes captures characteristic taxonomic distribution patterns.

Conclusions: This is the first time that a combined genome-wide analysis of low complexity, disease association and taxonomic distribution of human proteins is reported, covering structural, functional, and evolutionary properties. The reported framework can form the basis for large-scale, follow-up projects, encompassing the entire human genome and all known gene-disease associations.
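
Illustrative note: the abstract does not state which tool was used to detect compositional bias. As a rough sketch of the general idea only, low complexity regions can be flagged by sliding a window along a protein sequence and marking windows whose Shannon entropy falls below a cutoff. The function names, the window size of 12 and the threshold of 2.2 bits are assumptions chosen for illustration, not values taken from the study.

```python
import math
from collections import Counter


def shannon_entropy(segment):
    """Shannon entropy (in bits) of the residue composition of a segment."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(sequence, window=12, threshold=2.2):
    """Return merged half-open (start, end) spans of low-entropy windows.

    `window` and `threshold` are hypothetical defaults for illustration,
    not parameters reported in the abstract.
    """
    flagged = []
    for start in range(len(sequence) - window + 1):
        if shannon_entropy(sequence[start:start + window]) <= threshold:
            flagged.append((start, start + window))

    # Merge overlapping or adjacent windows into contiguous regions.
    merged = []
    for s, e in flagged:
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged
```

For example, a homopolymeric stretch such as a run of glutamate residues has an entropy of zero and is flagged in full, whereas a window of twelve distinct residues has an entropy of about 3.58 bits and is not flagged under these assumed settings.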

Details

ISSN :
20461402
Volume :
12
Database :
F1000Research
Journal :
F1000Research
Notes :
[version 1; peer review: 1 approved, 1 approved with reservations]
Publication Type :
Academic Journal
Accession number :
edsfor.10.12688.f1000research.129929.1
Document Type :
research-article
Full Text :
https://doi.org/10.12688/f1000research.129929.1