On the evaluation of outlier detection and one-class classification: a comparative study of algorithms, model selection, and ensembles.

Authors :
Marques, Henrique O.
Swersky, Lorne
Sander, Jörg
Campello, Ricardo J. G. B.
Zimek, Arthur
Source :
Data Mining & Knowledge Discovery; Jul 2023, Vol. 37 Issue 4, p1473-1517, 45p
Publication Year :
2023

Abstract

It has been shown that unsupervised outlier detection methods can be adapted to the one-class classification problem (Janssens and Postma, in: Proceedings of the 18th annual Belgian-Dutch conference on machine learning, pp 56–64, 2009; Janssens et al. in: Proceedings of the 2009 ICMLA international conference on machine learning and applications, IEEE Computer Society, pp 147–153, 2009. https://doi.org/10.1109/ICMLA.2009.16). In this paper, we focus on the comparison of one-class classification algorithms with such adapted unsupervised outlier detection methods, improving on previous comparison studies in several important aspects. We study a number of one-class classification and unsupervised outlier detection methods in a rigorous experimental setup, comparing them on a large number of datasets with different characteristics, using different performance measures. In contrast to previous comparison studies, where the models (algorithms, parameters) are selected by using examples from both classes (outlier and inlier), here we also study and compare different approaches for model selection in the absence of examples from the outlier class, which is more realistic for practical applications since labeled outliers are rarely available. Our results showed that, overall, SVDD and GMM are top performers, regardless of whether the ground truth is used for parameter selection or not. However, in specific application scenarios, other methods exhibited better performance. Combining one-class classifiers into ensembles showed better performance than individual methods in terms of accuracy, as long as the ensemble members are properly selected. [ABSTRACT FROM AUTHOR]
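
Illustrative note (not part of the published abstract): the idea of adapting an unsupervised outlier detector to one-class classification can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a k-nearest-neighbor distance as the outlier score and a threshold set to the largest leave-one-out score among the training inliers; the function names `kth_nn_distance` and `fit_knn_one_class`, the value of k, and the toy data are all hypothetical.

```python
import math


def kth_nn_distance(point, reference, k):
    """Outlier score: Euclidean distance from point to its k-th nearest reference point."""
    dists = sorted(math.dist(point, r) for r in reference)
    return dists[k - 1]


def fit_knn_one_class(inliers, k=2):
    """Build a one-class classifier from inlier examples only.

    Assumption: the decision threshold is the largest leave-one-out
    score observed on the training inliers (no outlier labels are used).
    """
    train_scores = [
        kth_nn_distance(p, inliers[:i] + inliers[i + 1:], k)
        for i, p in enumerate(inliers)
    ]
    threshold = max(train_scores)

    def classify(x):
        # A new observation is flagged when it scores above every training inlier.
        score = kth_nn_distance(x, inliers, k)
        return "outlier" if score > threshold else "inlier"

    return classify


# Hypothetical toy data: five inliers in the unit square.
inliers = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
classify = fit_knn_one_class(inliers, k=2)
print(classify((0.4, 0.6)))
print(classify((5.0, 5.0)))
```

The key point of the sketch is that both the scoring model and the threshold are derived from inliers alone, which mirrors the setting the abstract describes, where labeled outliers are unavailable for model selection.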

Details

Language :
English
ISSN :
13845810
Volume :
37
Issue :
4
Database :
Complementary Index
Journal :
Data Mining & Knowledge Discovery
Publication Type :
Academic Journal
Accession number :
164754063
Full Text :
https://doi.org/10.1007/s10618-023-00931-x