Back to Search
Start Over
Fast and Exact Outlier Detection in Metric Spaces: A Proximity Graph-based Approach
- Publication Year :
- 2021
-
Abstract
- Distance-based outlier detection is widely adopted in many fields, e.g., data mining and machine learning, because it is unsupervised, can be employed in a generic metric space, and does not have any assumptions of data distributions. Data mining and machine learning applications face a challenge of dealing with large datasets, which requires efficient distance-based outlier detection algorithms. Due to the popularization of computational environments with large memory, it is possible to build a main-memory index and detect outliers based on it, which is a promising solution for fast distance-based outlier detection. Motivated by this observation, we propose a novel approach that exploits a proximity graph. Our approach can employ an arbitrary proximity graph and obtains a significant speed-up against state-of-the-art. However, designing an effective proximity graph raises a challenge, because existing proximity graphs do not consider efficient traversal for distance-based outlier detection. To overcome this challenge, we propose a novel proximity graph, MRPG. Our empirical study using real datasets demonstrates that MRPG detects outliers significantly faster than the state-of-the-art algorithms.<br />Comment: Accepted to SIGMOD2021
- Subjects :
- Computer Science - Databases
Subjects
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2110.08959
- Document Type :
- Working Paper