Back to Search
Start Over
BDD-BASED ANALYSIS OF GAPPED q-GRAM FILTERS
- Source :
- International Journal of Foundations of Computer Science. 16:1121-1134
- Publication Year :
- 2005
- Publisher :
- World Scientific Pub Co Pte Lt, 2005.
-
Abstract
- Recently, there has been a surge of interest in gapped q-gram filters for approximate string matching. Important design parameters for filters are for example the value of q, the filter-threshold and in particular the shape (aka seed) of the filter. A good choice of parameters can improve the performance of a q-gram filter by orders of magnitude and optimizing these parameters is a nontrivial combinatorial problem. We describe a new method for analyzing gapped q-gram filters. This method is simple and generic. It applies to a variety of filters, overcomes many restrictions that are present in existing algorithms and can easily be extended to new filter variants. To implement our approach, we use an extended version of BDDs (Binary Decision Diagrams), a data structure that efficiently represents sets of bit-strings. In a second step, we define a new class of multi-shape filters and analyze these filters with the BDD-based approach. Experiments show that multi-shape filters can outperform the best single-shape filters, which are currently in use, in many aspects. The BDD-based algorithm is crucial for the design and analysis of these new and better multi-shape filters. Our results apply to the k-mismatches problem, i.e. approximate string matching with Hamming distance.
- Subjects :
- 0303 health sciences
Binary decision diagram
Hamming distance
02 engineering and technology
Filter (signal processing)
Approximate string matching
Data structure
Adaptive filter
03 medical and health sciences
Simple (abstract algebra)
0202 electrical engineering, electronic engineering, information engineering
Computer Science (miscellaneous)
020201 artificial intelligence & image processing
Sensitivity (control systems)
Algorithm
030304 developmental biology
Mathematics
Subjects
Details
- ISSN :
- 17936373 and 01290541
- Volume :
- 16
- Database :
- OpenAIRE
- Journal :
- International Journal of Foundations of Computer Science
- Accession number :
- edsair.doi...........dbe55923ce949a4bf28db634f27393c4