
Agglomerative Clustering with Threshold Optimization via Extreme Value Theory.

Authors :
Li, Chunchun
Günther, Manuel
Dhamija, Akshay Raj
Cruz, Steve
Jafarzadeh, Mohsen
Ahmad, Touqeer
Boult, Terrance E.
Source :
Algorithms. May 2022, Vol. 15 Issue 5, p170. 23p.
Publication Year :
2022

Abstract

Clustering is a critical part of many tasks and, in most applications, the number of clusters in the data is unknown and must be estimated. This paper presents an Extreme Value Theory-based approach to threshold selection for clustering, proving that the "correct" linkage distances must follow a Weibull distribution for smooth feature spaces. Deep networks and their associated deep features have transformed many aspects of learning, and this paper shows they are consistent with our extreme-linkage theory and provide Unreasonable Clusterability. We show how our novel threshold selection can be applied to both classic agglomerative clustering and the more recent FINCH (First Integer Neighbor Clustering Hierarchy) algorithm. Our evaluation utilizes over a dozen different large-scale vision datasets/subsets, including multiple face-clustering datasets and ImageNet for both in-domain and, more importantly, out-of-domain object clustering. Across multiple deep-feature clustering tasks with very different characteristics, our novel automated threshold selection performs well, often outperforming state-of-the-art clustering techniques even when they select parameters on the test set. [ABSTRACT FROM AUTHOR]
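
The general idea in the abstract, fitting a Weibull distribution to linkage distances and cutting an agglomerative hierarchy at a threshold derived from that fit, can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes SciPy is available, and the function name `weibull_threshold_clustering`, the use of nearest-neighbor distances as the sample to fit, and the `quantile` cut-off are all hypothetical choices made for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial import cKDTree
from scipy.stats import weibull_min


def weibull_threshold_clustering(features, quantile=0.99, method="single"):
    """Illustrative sketch only, NOT the method from the paper.

    Assumption: each point's nearest-neighbor distance is treated as a
    sample of "small" linkage distances, a Weibull distribution is fitted
    to those samples, and a high quantile of the fit is used as the
    distance threshold for cutting the dendrogram.
    """
    features = np.asarray(features, dtype=float)

    # k=2 because the closest hit for every query point is the point itself.
    dists, _ = cKDTree(features).query(features, k=2)
    nn = dists[:, 1]
    nn = nn[nn > 0]  # drop exact duplicates, which would give zero distances

    # Fit a Weibull with the location fixed at 0 (distances are non-negative).
    shape, loc, scale = weibull_min.fit(nn, floc=0)

    # Hypothetical rule: cut at a high quantile of the fitted distribution.
    threshold = float(weibull_min.ppf(quantile, shape, loc=loc, scale=scale))

    # Standard agglomerative clustering, cut at the estimated threshold.
    tree = linkage(features, method=method)
    labels = fcluster(tree, t=threshold, criterion="distance")
    return labels, threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    blob_a = rng.normal(0.0, 1.0, size=(50, 2))
    blob_b = rng.normal(100.0, 1.0, size=(50, 2))
    labels, threshold = weibull_threshold_clustering(np.vstack([blob_a, blob_b]))
    print("threshold:", threshold, "clusters:", len(set(labels)))
```

The number of clusters is never passed in; it falls out of the threshold. With single linkage and a threshold taken from nearest-neighbor distances, a sketch like this may well split a single true cluster into several pieces, which is one reason the choice of which distances to fit, and how to turn the fit into a threshold, matters.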

Details

Language :
English
ISSN :
19994893
Volume :
15
Issue :
5
Database :
Academic Search Index
Journal :
Algorithms
Publication Type :
Academic Journal
Accession number :
157129948
Full Text :
https://doi.org/10.3390/a15050170