Sorry, I don't understand your search. ×
Back to Search Start Over

Fixed Block Compression Boosting in FM-Indexes: Theory and Practice

Authors :
Simon J. Puglisi
Juha Kärkkäinen
Matthias Petri
Dominik Kempa
Simon Gog
Department of Computer Science
Practical Algorithms and Data Structures on Strings research group / Juha Kärkkäinen
Bioinformatics
Helsinki Institute for Information Technology
Genome-scale Algorithmics research group / Veli Mäkinen
Algorithmic Bioinformatics
Source :
Algorithmica. 81:1370-1391
Publication Year :
2018
Publisher :
Springer Science and Business Media LLC, 2018.

Abstract

The FM index (Ferragina and Manzini in J ACM 52(4):552-581, 2005) is a widely-used compressed data structure that stores a string T in a compressed form and also supports fast pattern matching queries. In this paper, we describe new FM-index variants that combine nice theoretical properties, simple implementation and improved practical performance. Our main theoretical result is a new technique called fixed block compression boosting, which is a simpler and faster alternative to optimal compression boosting and implicit compression boosting used in previous FM-indexes. We also describe several new techniques for implementing fixed-block boosting efficiently, including a new, fast, and space-efficient implementation of wavelet trees. Our extensive experiments show the new indexes to be consistently fast and small relative to the state-of-the-art, and thus they make a good off-the-shelf choice for many applications.

Details

ISSN :
14320541 and 01784617
Volume :
81
Database :
OpenAIRE
Journal :
Algorithmica
Accession number :
edsair.doi.dedup.....77fddf2a504f06382f24d986dd416f7d