Back to Search Start Over

r-indexing the eBWT.

Authors :
Boucher, Christina
Cenzato, Davide
Lipták, Zsuzsanna
Rossi, Massimiliano
Sciortino, Marinella
Source :
Information & Computation. Jun2024, Vol. 298, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

The extended Burrows-Wheeler Transform (eBWT) [Mantaci et al. TCS 2007] is a variant of the BWT, introduced for collections of strings. In this paper, we present the extended r-index , an analogous data structure to the r -index [Gagie et al. JACM 2020]. It occupies O (r) words, with r the number of runs of the eBWT, and offers the same functionalities as the r -index. We also show how to efficiently support finding maximal exact matches (MEMs). We implemented the extended r -index and tested it on circular bacterial genomes and plasmids, comparing it to five state-of-the-art compressed text indexes. While our data structure maintains similar time and memory requirements for answering pattern matching queries as the original r -index, it is the only index in the literature that can naturally be used for both circular and linear input collections. This is an extended version of [Boucher et al., r-indexing the eBWT, SPIRE 2021]. • Introducing the extended r -index (r -index for the eBWT). • Full cyclic pattern matching functionality in O (r) space. • Efficient construction of the extended r -index. • Computation of MEMs with the extended r -index. • Experimental comparison with several compressed indexes on real biological data. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
08905401
Volume :
298
Database :
Academic Search Index
Journal :
Information & Computation
Publication Type :
Academic Journal
Accession number :
177319938
Full Text :
https://doi.org/10.1016/j.ic.2024.105155