A modified Markov clustering approach to unsupervised classification of protein sequences

Authors :
Szilágyi, László
Medvés, Lehel
Szilágyi, Sándor M.
Source :
Neurocomputing. Aug 2010, Vol. 73, Issue 13-15, pp. 2332-2345 (14 pages).
Publication Year :
2010

Abstract

In this paper we propose a modified Markov clustering algorithm for efficient and accurate clustering of large protein sequence databases, based on previously evaluated sequence similarity criteria. The modification consists of an exponentially decreasing inflation rate: strong inflation at the beginning quickly establishes the hard structure of the clusters, and weaker inflation thereafter produces fine partitions. The algorithm was tested and validated on the whole SCOP95 database and on randomly selected 10–50% sections of it; it generally converges within 12–14 iteration cycles and provides clusters of high quality. Furthermore, a novel generalized formula for the inflation operation is given, and an efficient matrix symmetrization technique is presented, which improve the partition quality at a relatively low extra computational cost. Finally, an extra speedup is achieved by excluding isolated proteins from further processing. The proposed method performs better than previous solutions in terms of both partition quality and computational load. [ABSTRACT FROM AUTHOR]
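
The core idea of the abstract, Markov clustering with an inflation rate that decays over the iterations, can be illustrated with a minimal sketch. This is not the authors' implementation. The record gives neither the decay formula nor the generalized inflation formula nor the symmetrization technique, so the sketch uses the standard element-wise power inflation and omits symmetrization and the isolated-protein speedup. The schedule in `inflation_rate` and its parameters `r_start`, `r_end` and `decay` are hypothetical, as are all function names and the toy similarity matrix.

```python
import numpy as np

def normalize_columns(m):
    """Scale each column to sum to 1 (column-stochastic matrix)."""
    sums = m.sum(axis=0, keepdims=True)
    sums[sums == 0] = 1.0  # leave all-zero columns untouched
    return m / sums

def inflation_rate(t, r_start=2.5, r_end=1.5, decay=0.5):
    """Hypothetical schedule (an assumption, not the paper's formula):
    decays exponentially from r_start at t=0 toward r_end."""
    return r_end + (r_start - r_end) * np.exp(-decay * t)

def markov_cluster(similarity, max_iter=50, tol=1e-6):
    """Standard MCL loop (expansion, inflation) with a decaying inflation rate."""
    m = normalize_columns(np.asarray(similarity, dtype=float))
    for t in range(max_iter):
        prev = m
        m = m @ m                            # expansion: two-step random walk
        m = np.power(m, inflation_rate(t))   # inflation: element-wise power
        m = normalize_columns(m)
        if np.abs(m - prev).max() < tol:     # stop once the matrix is stable
            break
    return m

def extract_clusters(m, threshold=1e-5):
    """Label nodes by connected components of the thresholded matrix."""
    n = m.shape[0]
    adj = (m > threshold) | (m.T > threshold)
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(adj[i]):
                if labels[j] < 0:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Toy similarity matrix: two pairs of similar sequences and one isolated one.
similarity = np.array([
    [1.0, 0.9, 0.0, 0.0, 0.0],
    [0.9, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.8, 0.0],
    [0.0, 0.0, 0.8, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
])
labels = extract_clusters(markov_cluster(similarity))
print(labels.tolist())
```

Clusters are read off as connected components of the converged matrix rather than by a per-column maximum, because symmetric ties inside a cluster would otherwise make the assignment depend on floating-point rounding.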

Details

Language :
English
ISSN :
0925-2312
Volume :
73
Issue :
13-15
Database :
Academic Search Index
Journal :
Neurocomputing
Publication Type :
Academic Journal
Accession number :
52875191
Full Text :
https://doi.org/10.1016/j.neucom.2010.02.023