DFSP: a Depth-First SPelling algorithm for sequential pattern mining of biological sequences

Authors :
Vance Chiang-Chi Liao
Ming-Syan Chen
Source :
Knowledge and Information Systems. 38:623-639
Publication Year :
2013
Publisher :
Springer Science and Business Media LLC, 2013.

Abstract

Scientific progress in recent years has led to the generation of huge amounts of biological data, most of which remains unanalyzed. Mining the data may provide insights into various realms of biology, such as finding co-occurring biosequences, which are essential for biological data mining and analysis. Data mining techniques like sequential pattern mining may reveal implicitly meaningful patterns among DNA or protein sequences. If biologists hope to unlock the potential of sequential pattern mining in their field, it is necessary to move away from traditional sequential pattern mining algorithms, because they have difficulty handling the small number of items and the long sequences found in biological data, such as gene and protein sequences. To address the problem, we propose an approach called the Depth-First SPelling (DFSP) algorithm for mining sequential patterns in biological sequences. The algorithm’s processing speed is faster than that of PrefixSpan, its leading competitor, and it is superior to other sequential pattern mining algorithms for biological sequences.

Details

ISSN :
0219-3116 and 0219-1377
Volume :
38
Database :
OpenAIRE
Journal :
Knowledge and Information Systems
Accession number :
edsair.doi...........30faa8143cdf3a4ccc6b6d6a8d59046d