1. Effective estimation of the minimum number of amino acid residues required for functional divergence between duplicate genes
- Author
- Yangyun Zou, Jingqi Zhou, Dangyun Liu, Zhining Sa, Xun Gu, and Wei Huang
- Subjects
0301 basic medicine; Statistics as Topic; Value (computer science); Paralogous Gene; Computational biology; Biology; Evolution, Molecular; 03 medical and health sciences; Genes, Duplicate; Phylogenetics; Gene Duplication; Databases, Genetic; Gene duplication; Genetics; Cutoff; Computer Simulation; Amino Acids; Molecular Biology; Gene; Phylogeny; Ecology, Evolution, Behavior and Systematics; 030104 developmental biology; Order (biology); Multigene Family; Functional divergence
- Abstract
Predicting the amino acid residues that underlie functional divergence after gene duplication has long been a focus of research, as the predicted sites can serve as candidates for further functional experiments. It is both important and interesting to know how many sites, on average, may have been responsible for the functional divergence between duplicate genes. In this article, we studied two basic types of functional divergence (type-I and type-II) in depth in order to accurately estimate the number of sites related to functional divergence. Type-I divergence results from altered functional constraints (i.e., different evolutionary rates) between duplicate genes, whereas type-II divergence refers to residues that are conserved by functional constraints but exhibit different physicochemical properties (e.g., charge or hydrophobicity) between duplicates. We applied an effective site number (NE) strategy, which uses a stepwise regression model to calculate the minimum number of residues responsible for functional divergence without requiring a preset threshold. We found that the NE-determined cutoff value varies among duplicate pairs, suggesting that a single empirical cutoff value is not suitable for every case. Under our standard NE calculation method, we estimated that fewer than 15% of residues are required for functional divergence between paralogous genes. Finally, we established a database, DIVERGE-D, as a public resource for the NE sites predicted between the paralog pairs in this study; these sites can be used as candidates for further biological engineering and experimentation.
- Published
- 2017