1. Statistical analysis of a hierarchical clustering algorithm with outliers.
- Author
- Klutchnikoff, Nicolas; Poterie, Audrey; Rouvière, Laurent
- Subjects
- *STATISTICS, *CLUSTER analysis (Statistics), *HIERARCHICAL clustering (Cluster analysis), *ALGORITHMS
- Abstract
It is well known that, in the presence of outliers, the single linkage algorithm generally fails to identify clusters. In this paper, we construct a new version of this algorithm, less sensitive to outliers, and study both its theoretical properties and its practical behavior. In particular, we provide an oracle-type inequality which guarantees that our procedure recovers clusters with high probability under mild assumptions on the distribution of the outliers. Using this inequality, we prove the consistency of our method and exhibit rates of convergence in various situations. The performance of this approach is also assessed through simulation studies. A thorough comparison with several classical clustering algorithms on simulated data is presented. [ABSTRACT FROM AUTHOR]
- Published
- 2022