Back to Search Start Over

QSketch: An Efficient Sketch for Weighted Cardinality Estimation in Streams

Authors :
Qi, Yiyan
Li, Rundong
Wang, Pinghui
Sun, Yufang
Xing, Rui
Publication Year :
2024

Abstract

Estimating cardinality, i.e., the number of distinct elements, of a data stream is a fundamental problem in areas like databases, computer networks, and information retrieval. This study delves into a broader scenario where each element carries a positive weight. Unlike traditional cardinality estimation, limited research exists on weighted cardinality, with current methods requiring substantial memory and computational resources, challenging for devices with limited capabilities and real-time applications like anomaly detection. To address these issues, we propose QSketch, a memory-efficient sketch method for estimating weighted cardinality in streams. QSketch uses a quantization technique to condense continuous variables into a compact set of integer variables, with each variable requiring only 8 bits, making it 8 times smaller than previous methods. Furthermore, we leverage dynamic properties during QSketch generation to significantly enhance estimation accuracy and achieve a lower time complexity of $O(1)$ for updating estimations upon encountering a new element. Experimental results on synthetic and real-world datasets show that QSketch is approximately 30\% more accurate and two orders of magnitude faster than the state-of-the-art, using only $1/8$ of the memory.<br />Comment: 12 pages, 10 figures, accepted by KDD 2024

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2406.19143
Document Type :
Working Paper