Back to Search Start Over

Fast and Exact Outlier Detection in Metric Spaces: A Proximity Graph-based Approach

Authors :
Amagata, Daichi
Onizuka, Makoto
Hara, Takahiro
Publication Year :
2021

Abstract

Distance-based outlier detection is widely adopted in many fields, e.g., data mining and machine learning, because it is unsupervised, can be employed in a generic metric space, and does not have any assumptions of data distributions. Data mining and machine learning applications face a challenge of dealing with large datasets, which requires efficient distance-based outlier detection algorithms. Due to the popularization of computational environments with large memory, it is possible to build a main-memory index and detect outliers based on it, which is a promising solution for fast distance-based outlier detection. Motivated by this observation, we propose a novel approach that exploits a proximity graph. Our approach can employ an arbitrary proximity graph and obtains a significant speed-up against state-of-the-art. However, designing an effective proximity graph raises a challenge, because existing proximity graphs do not consider efficient traversal for distance-based outlier detection. To overcome this challenge, we propose a novel proximity graph, MRPG. Our empirical study using real datasets demonstrates that MRPG detects outliers significantly faster than the state-of-the-art algorithms.<br />Comment: Accepted to SIGMOD2021

Subjects

Subjects :
Computer Science - Databases

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2110.08959
Document Type :
Working Paper