Trigram-Based Persistent IDE Indices with Quick Startup

Authors :
Iakovlev, Zakhar
Chulkov, Alexey
Golikov, Nikita
Lukianov, Vyacheslav
Zinoviev, Nikita
Ivanov, Dmitry
Aksenov, Vitaly
Publication Year :
2024

Abstract

A common way to speed up search within a set of text files is a trigram index. This structure is a map from each trigram (a sequence of three characters) to the set of files that contain it. When searching for a pattern, candidate files are identified by intersecting the sets associated with the trigrams in the pattern, and the search then proceeds only in these files. However, in a code repository, the trigram index differs from version to version. Upon checking out a new version, the index is typically rebuilt from scratch, which is time-consuming, whereas we want the index to have almost zero startup time. Thus, we explore a persistent version of the trigram index for full-text and keyword pattern search. Our approach reuses the current version of the trigram index and applies only the changes between versions during checkout, significantly enhancing performance. Furthermore, we extend our data structure to accommodate CamelHump search for class and function names.
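
The basic mechanism described in the abstract (a map from trigrams to file sets, candidate selection by intersection, and updating the index from per-file changes instead of rebuilding it) can be illustrated with a minimal Python sketch. This is not the authors' data structure: the class `TrigramIndex` and its methods are hypothetical names chosen for illustration, the index lives in memory and is updated in place rather than being persistent across versions, and CamelHump search is not covered.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Illustrative in-memory trigram index (hypothetical, not the paper's structure)."""

    def __init__(self):
        # trigram -> set of names of files that contain it
        self.postings = defaultdict(set)

    def add_file(self, name, content):
        for t in trigrams(content):
            self.postings[t].add(name)

    def candidates(self, pattern):
        """Files containing every trigram of pattern; may include false positives."""
        grams = trigrams(pattern)
        if not grams:
            return None  # pattern shorter than 3 characters: the index cannot help
        return set.intersection(*(self.postings.get(t, set()) for t in grams))

    def search(self, pattern, files):
        """Verify candidates by scanning their content; files maps name -> content."""
        cands = self.candidates(pattern)
        if cands is None:
            cands = files.keys()
        return sorted(name for name in cands if pattern in files[name])

    def apply_change(self, name, old, new):
        """Update the index for one changed file (None means the file is absent).

        Only trigrams that appear or disappear are touched, so the cost depends
        on the size of the change set rather than on the whole repository.
        """
        old_grams = trigrams(old) if old is not None else set()
        new_grams = trigrams(new) if new is not None else set()
        for t in old_grams - new_grams:
            self.postings[t].discard(name)
        for t in new_grams - old_grams:
            self.postings[t].add(name)
```

In this sketch, a checkout would call `apply_change` once per file reported as changed between the two versions. Note that `candidates` only narrows the search: a file can contain all trigrams of a pattern without containing the pattern itself, which is why `search` still scans the candidate files.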

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2403.03751
Document Type :
Working Paper