Modeling and Computing Probabilistic Skyline on Incomplete Data.
- Source :
- IEEE Transactions on Knowledge & Data Engineering; Jul2020, Vol. 32 Issue 7, p1405-1418, 14p
- Publication Year :
- 2020
Abstract
- The skyline query is important in the database community. In recent years, research on incomplete data has received increasing attention, especially for the skyline query. However, the existing skyline definition on incomplete data cannot provide users with valuable references. In this paper, we propose a novel skyline definition that applies a probabilistic model to incomplete data, in which each point has a probability of being in the skyline. In particular, the query returns the K points with the highest skyline probabilities. In addition, we propose incomplete-data models and estimate the probability density functions of missing values under independent, correlated, and anti-correlated distributions, respectively. Computing the probabilistic skyline on incomplete data is a significant challenge. We propose three efficient algorithms, SPISkyline, SPCSkyline, and SPASkyline, for probabilistic skyline computation on incomplete data that follows independent, correlated, and anti-correlated distributions, respectively. They employ a pruning strategy, an optimized probability computation, and a sorting technique to improve the efficiency of probabilistic skyline computation on incomplete data. Our experimental results demonstrate that the proposed concept of probabilistic skyline is an effective way to handle skyline queries on incomplete data, and that our algorithms are tens of times faster than the naive algorithm on both synthetic and real datasets. [ABSTRACT FROM AUTHOR]
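The idea of a top-K probabilistic skyline can be illustrated with a minimal Monte Carlo sketch. This is not the paper's SPISkyline, SPCSkyline, or SPASkyline algorithm, and it uses none of their pruning or sorting optimizations. It assumes, purely for illustration, that smaller values are better, that missing values are marked with `None`, and that each missing value is drawn independently and uniformly from [0, 1). The function names and parameters (`dominates`, `skyline_indices`, `top_k_probabilistic_skyline`, `trials`, `seed`) are hypothetical.

```python
import random
from typing import List, Optional, Sequence, Tuple

Point = Sequence[Optional[float]]  # None marks a missing value (assumption)


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """p dominates q: no worse in every dimension, strictly better in one.

    Assumes smaller values are preferred.
    """
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def skyline_indices(points: List[Sequence[float]]) -> List[int]:
    """Indices of complete points that no other point dominates (naive O(n^2))."""
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p)
                       for j, q in enumerate(points) if j != i)]


def top_k_probabilistic_skyline(data: List[Point], k: int,
                                trials: int = 2000,
                                seed: int = 0) -> List[Tuple[int, float]]:
    """Estimate each point's skyline probability by sampling, return the top k.

    Each trial fills every missing value with an independent uniform draw
    (an illustrative assumption, not an estimated density), computes the
    skyline of the completed data, and counts memberships.
    """
    rng = random.Random(seed)
    counts = [0] * len(data)
    for _ in range(trials):
        world = [[rng.random() if v is None else v for v in p] for p in data]
        for i in skyline_indices(world):
            counts[i] += 1
    probs = [c / trials for c in counts]
    ranked = sorted(range(len(data)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


if __name__ == "__main__":
    # The third point has a missing second attribute.
    data = [[0.1, 0.1], [0.9, 0.9], [0.5, None]]
    for index, prob in top_k_probabilistic_skyline(data, k=2):
        print(index, round(prob, 3))
```

In this toy dataset the first point is never dominated, the second is always dominated by the first, and the third stays in the skyline only when its sampled value falls below 0.1, so its estimated probability should be close to 0.1. Sampling is used here only because it is simple to verify; it trades accuracy and speed for simplicity.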
Details
- Language :
- English
- ISSN :
- 10414347
- Volume :
- 32
- Issue :
- 7
- Database :
- Complementary Index
- Journal :
- IEEE Transactions on Knowledge & Data Engineering
- Publication Type :
- Academic Journal
- Accession number :
- 143721606
- Full Text :
- https://doi.org/10.1109/TKDE.2019.2904967