A Sequence-Based Dynamic Ensemble Learning System for Protein Ligand-Binding Site Prediction.

Authors :
Chen P
Hu S
Zhang J
Gao X
Li J
Xia J
Wang B
Source :
IEEE/ACM transactions on computational biology and bioinformatics [IEEE/ACM Trans Comput Biol Bioinform] 2016 Sep-Oct; Vol. 13 (5), pp. 901-912. Date of Electronic Publication: 2015 Dec 03.
Publication Year :
2016

Abstract

Background: Proteins have the fundamental ability to bind selectively to other molecules, as in protein-ligand binding, and to perform specific functions through such interactions. Accurate prediction of the protein residues that physically bind to ligands is important for drug design and protein docking studies. Most successful protein-ligand binding predictions have been based on known structures. However, structural information is often unavailable in practice because of the huge gap between the number of known protein sequences and the number of experimentally solved structures.
Results: This paper proposes a dynamic ensemble approach that identifies protein-ligand binding residues using sequence information only. To avoid the problems caused by the severe imbalance between ligand-binding and non-ligand-binding sites, we constructed several balanced data sets and trained a random forest classifier on each of them. For each target protein, we dynamically selected a subset of classifiers according to the similarity between the target protein and the proteins in the training data set. Combining the predictions of the selected classifiers yielded the final predictions for that query protein. The ensemble of these classifiers formed a sequence-based predictor of protein-ligand binding sites.
Conclusions: Experimental results on two Critical Assessment of protein Structure Prediction (CASP) datasets and the ccPDB dataset demonstrated that the proposed method compared favorably with state-of-the-art methods.
Availability: http://www2.ahu.edu.cn/pchen/web/LigandDSES.htm.
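
The procedure outlined in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions: the abstract does not say how the balanced data sets were built (random undersampling of the non-binding class is assumed here), how similarity to the target protein was measured (the scores are taken as given inputs), or how predictions were combined (a plain average with a 0.5 threshold is assumed). The function names are hypothetical, and simple callables stand in for the trained random forest classifiers.

```python
import random
from statistics import mean


def make_balanced_sets(positives, negatives, n_sets, seed=0):
    """Build n_sets balanced sets: all positives plus an equally sized
    random draw of negatives (undersampling is an assumption)."""
    rng = random.Random(seed)
    return [
        (list(positives), rng.sample(negatives, len(positives)))
        for _ in range(n_sets)
    ]


def select_classifiers(similarities, k):
    """Return the indices of the k classifiers whose training proteins
    are most similar to the target (similarity scores are given)."""
    ranked = sorted(range(len(similarities)),
                    key=lambda i: similarities[i], reverse=True)
    return ranked[:k]


def ensemble_predict(classifiers, similarities, residues, k, threshold=0.5):
    """Average the scores of the selected classifiers for each residue
    and threshold the average (averaging is an assumption)."""
    chosen = select_classifiers(similarities, k)
    scores = [mean(classifiers[i](x) for i in chosen) for x in residues]
    labels = [s >= threshold for s in scores]
    return labels, scores


# Toy data: 3 binding residues versus 30 non-binding residues.
positives = ["p1", "p2", "p3"]
negatives = [f"n{i}" for i in range(30)]
balanced = make_balanced_sets(positives, negatives, n_sets=4)

# Stand-ins for four trained classifiers; each returns a binding score.
classifiers = [lambda x: 0.9, lambda x: 0.2, lambda x: 0.7, lambda x: 0.1]
similarities = [0.8, 0.1, 0.6, 0.3]  # hypothetical similarity to the target

labels, scores = ensemble_predict(
    classifiers, similarities, residues=["r1", "r2"], k=2
)
```

In a fuller version, each stand-in callable would be replaced by a classifier trained on one of the balanced sets, and `similarities` would come from a sequence comparison between the target protein and the training proteins.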

Details

Language :
English
ISSN :
1557-9964
Volume :
13
Issue :
5
Database :
MEDLINE
Journal :
IEEE/ACM transactions on computational biology and bioinformatics
Publication Type :
Academic Journal
Accession number :
26661785
Full Text :
https://doi.org/10.1109/TCBB.2015.2505286