Discovery and disentanglement of protein aligned pattern clusters to reveal subtle functional subgroups

Authors :
Antonio Sze-Tzo
Andrew K. C. Wong
Pei-Yuan Zhou
Source :
BIBM
Publication Year :
2017
Publisher :
IEEE, 2017.

Abstract

Proteins from the same family have similar functions. Hence, it is important to discover, from a protein family, conserved sequence patterns with variations in order to unveil the functionality of a functional domain. Aligned Pattern Clusters (APCs) are knowledge-rich representations compared with probabilistic models. If significant aligned residue associations (ARAs) are discovered in APCs, they can reveal subtle functional or subgroup characteristics. However, when ARAs corresponding to different subgroups/classes are entangled due to certain subtle factors, disentangling them to reveal succinct ARA groups is a big challenge. This paper presents a novel method, known as Aligned Residual Association Discovery and Disentanglement (ARADD), to meet this challenge. ARADD first constructs an ARA Frequency Matrix (ARAFM) and converts it into a Statistical Residual (SR) Vector Space (SRV) to suppress noise. An SR measures the deviation of the observed frequency of an event from the frequency expected if its occurrence were random. By applying Principal Component Decomposition (PCD) to the SRV, we obtain principal components (PCs) ranked by their variance. The ARAs of an aligned residue (AR) with others can be represented by an AR-vector whose coordinates account for its associations with others. When the projection of an AR-vector onto a PC space is reprojected to the SRV (the result is abbreviated RSRV), its coordinates reflect the SRs of that AR associating with other ARs. Experiments showed that ARADD can a) disentangle entangled ARAs in APCs and b) reveal subtle AR clusters relating to classes, or ARAs within or between subgroups, which is significant to proteomic research, drug discovery and personalized medicine.
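
The pipeline described in the abstract (frequency matrix, statistical residuals, principal component decomposition, reprojection) can be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: the abstract does not give the residual formula, so the Pearson-style residual (observed minus expected, divided by the square root of expected) is an assumption, as are the function names `standardized_residuals` and `pca_reproject` and the toy co-occurrence counts.

```python
import numpy as np

def standardized_residuals(freq):
    """Convert a co-occurrence frequency matrix into residuals.

    Assumption: expected counts come from the independence model
    (row total * column total / grand total), and the residual is
    (observed - expected) / sqrt(expected). The paper may use a
    different (e.g. adjusted) residual.
    """
    freq = np.asarray(freq, dtype=float)
    total = freq.sum()
    row = freq.sum(axis=1, keepdims=True)
    col = freq.sum(axis=0, keepdims=True)
    expected = row @ col / total
    with np.errstate(divide="ignore", invalid="ignore"):
        sr = (freq - expected) / np.sqrt(expected)
    # Cells with zero expected count carry no information; set them to 0.
    return np.nan_to_num(sr)

def pca_reproject(srv, k):
    """Project each row (an AR-vector) onto the top-k PCs, then map back."""
    mean = srv.mean(axis=0, keepdims=True)
    centered = srv - mean
    # SVD of the centered matrix yields the PCs as rows of vt,
    # already ordered by decreasing singular value (hence variance).
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2 / (srv.shape[0] - 1)
    components = vt[:k]
    scores = centered @ components.T          # coordinates in PC space
    reprojected = scores @ components + mean  # back in residual space
    return variance, scores, reprojected

# Hypothetical toy counts: residues 0 and 1 co-occur often, as do 2 and 3.
freq = np.array([
    [0, 30, 2, 1],
    [30, 0, 1, 2],
    [2, 1, 0, 25],
    [1, 2, 25, 0],
])

sr = standardized_residuals(freq)
variance, scores, rsrv = pca_reproject(sr, k=2)
```

Keeping only the leading components before reprojecting is what lets each row of `rsrv` show the associations captured by those components alone; which components to keep for a given subgroup is a choice the abstract leaves to the method itself.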

Details

Database :
OpenAIRE
Journal :
2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Accession number :
edsair.doi...........e6a4f503abddbb666a1dc1678364b084
Full Text :
https://doi.org/10.1109/bibm.2017.8217625