Back to Search
Start Over
LSH-Based Probabilistic Pruning of Inverted Indices for Sets and Ranked Lists
- Source :
- WebDB
- Publication Year :
- 2017
- Publisher :
- ACM, 2017.
-
Abstract
- We address the problem of index pruning without compromising the quality of ad-hoc similarity search among sets and ranked lists. We discuss three different ways to prune the index structure and, by linking the index structure with the concept of Locality Sensitive Hashing (LSH), we introduce two solutions to query processing over the pruned index. Through a probabilistic analysis we ensure that a user-defined recall goal is still guaranteed. We are able to formulate an optimization problem that can determine the optimal pruning factor for all three pruning methods. The experimental evaluations over real-world data validate that the optimal pruning factor indeed ensures the recall goal without any significant effect on the quality of similarity search on a much smaller index.
- Subjects :
- Optimization problem
Index (economics)
Computer science
business.industry
Nearest neighbor search
InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL
Probabilistic logic
02 engineering and technology
computer.software_genre
Machine learning
Locality-sensitive hashing
Principal variation search
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Probabilistic analysis of algorithms
Data mining
Artificial intelligence
Pruning (decision trees)
business
computer
Subjects
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the 20th International Workshop on the Web and Databases
- Accession number :
- edsair.doi...........9a89c55ff997418956fc7582ef40318f
- Full Text :
- https://doi.org/10.1145/3068839.3068845