Back to Search Start Over

LSH-Based Probabilistic Pruning of Inverted Indices for Sets and Ranked Lists

Authors :
Koninika Pal
Sebastian Michel
Source :
WebDB
Publication Year :
2017
Publisher :
ACM, 2017.

Abstract

We address the problem of index pruning without compromising the quality of ad-hoc similarity search among sets and ranked lists. We discuss three different ways to prune the index structure and, by linking the index structure with the concept of Locality Sensitive Hashing (LSH), we introduce two solutions to query processing over the pruned index. Through a probabilistic analysis we ensure that a user-defined recall goal is still guaranteed. We are able to formulate an optimization problem that can determine the optimal pruning factor for all three pruning methods. The experimental evaluations over real-world data validate that the optimal pruning factor indeed ensures the recall goal without any significant effect on the quality of similarity search on a much smaller index.

Details

Database :
OpenAIRE
Journal :
Proceedings of the 20th International Workshop on the Web and Databases
Accession number :
edsair.doi...........9a89c55ff997418956fc7582ef40318f
Full Text :
https://doi.org/10.1145/3068839.3068845