Back to Search Start Over

Understanding and Optimizing Conjunctive Predicates Under Memory-Efficient Storage Layouts.

Authors :
Wang, Zeke
Liu, Xue
Zhang, Kai
Zhou, Haihang
He, Bingsheng
Source :
IEEE Transactions on Knowledge & Data Engineering. Jun2021, Vol. 33 Issue 6, p2803-2817. 15p.
Publication Year :
2021

Abstract

Database queries can contain multiple predicates. The optimization of conjunctive predicates is still vital to the overall performance of analytic data processing tasks. Prior work proposes several memory-efficient storage layouts, e.g., BitWeaving and ByteSlice, to significantly accelerate predicate evaluation, as circuit-level intra-cycle parallelism available in modern CPUs can be exploited such that the total number of instructions can be dramatically reduced. However, the performance potential of conjunctive predicates has not been harvested yet under such storage layouts as there is no accurate cost model to provide necessary insights that guide the optimization process. In this paper, we propose a hybrid empirical/analytical cost model (Understanding) to unveil the performance characteristics of such storage layouts when applying to predicate evaluation. Our cost model takes into account effect of non-linear factors, e.g., cache miss and branch misprediction, and easily applies to different CPUs. The main finding from our cost model is to distinguish high-cost instruction (which suffers from cache miss and/or branch misprediction) from low-cost instruction (which enjoys cache hit and correct branch prediction) in the context of predicate evaluation under these storage layouts. Guided by such a finding, we propose a simple execution scheme Hebe (Optimizing), which is order-oblivious while maintaining high performance. Hebe is attractive to the query optimizer (QO), as the QO does not need to go through a sampling process to decide the optimal evaluation order in advance. The intuition behind Hebe is to significantly reduce the number of high-cost instructions while keeping low-cost instructions unchanged. Our finding from Hebe sheds light on the importance of accurate cost model that guide us to derive an efficient execution scheme for query processing on modern CPUs. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
10414347
Volume :
33
Issue :
6
Database :
Academic Search Index
Journal :
IEEE Transactions on Knowledge & Data Engineering
Publication Type :
Academic Journal
Accession number :
150287542
Full Text :
https://doi.org/10.1109/TKDE.2019.2958672