1. Detecting Meaningful Clusters From High-Dimensional Data: A Strongly Consistent Sparse Center-Based Clustering Approach.
- Author
- Chakraborty, Saptarshi and Das, Swagatam
- Subjects
- *FEATURE selection, *HIGH-dimensional model representation, *WEIGHT gain, *FEATURE extraction, *NOISE measurement, *ALGORITHMS, *K-means clustering
- Abstract
- In context to high-dimensional clustering, the concept of feature weighting has gained considerable importance over the years to capture the relative degrees of importance of different features in revealing the cluster structure of the dataset. However, the popular techniques in this area either fail to perform feature selection or do not preserve the simplicity of Lloyd’s heuristic to solve the $k$-means problem and the like. In this paper, we propose a Lasso Weighted $k$-means ($LW$-$k$-means) algorithm, as a simple yet efficient sparse clustering procedure for high-dimensional data where the number of features ($p$) can be much higher than the number of observations ($n$). The $LW$-$k$-means method imposes an $\ell_1$ regularization term involving the feature weights directly to induce feature selection in a sparse clustering framework. We develop a simple block-coordinate descent type algorithm with time-complexity resembling that of Lloyd’s method, to optimize the proposed objective. In addition, we establish the strong consistency of the $LW$-$k$-means procedure. Such an analysis of the large sample properties is not available for the conventional sparse $k$-means algorithms, in general. $LW$-$k$-means is tested on a number of synthetic and real-life datasets and through a detailed experimental analysis, we find that the performance of the method is highly competitive against the baselines as well as the state-of-the-art procedures for center-based high-dimensional clustering, not only in terms of clustering accuracy but also with respect to computational time. [ABSTRACT FROM AUTHOR]
- Published
- 2022
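
The abstract describes a block-coordinate descent scheme that alternates Lloyd-style updates with an $\ell_1$-penalized feature-weight update, but this record does not give the $LW$-$k$-means objective itself. The sketch below is therefore only an illustration of the general idea, not the authors' algorithm. It assumes a simpler, penalized objective in the spirit of conventional sparse $k$-means, $J = \tfrac{1}{2}\lVert w\rVert_2^2 + \lambda\lVert w\rVert_1 - \sum_l w_l B_l$ with $w \ge 0$, where $B_l$ is the between-cluster share of the variance of feature $l$. The function name `sparse_weighted_kmeans` and the parameters `lam` and `init_idx` are hypothetical names chosen for this example.

```python
import numpy as np


def sparse_weighted_kmeans(X, k, lam, init_idx=None, n_iter=20, seed=0):
    """Illustrative l1-penalized feature-weighted k-means.

    ASSUMPTION: this is NOT the LW-k-means objective from the paper. It
    minimizes J = 0.5*||w||^2 + lam*||w||_1 - sum_l w_l * B_l over w >= 0,
    labels and centers, where B_l is the between-cluster variance share
    of feature l.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    # Standardize so every non-constant feature has (approximately) unit variance.
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    total = (X ** 2).mean(axis=0)  # per-feature total variance

    if init_idx is None:
        # Random initial centers; results can depend on this choice.
        init_idx = np.random.default_rng(seed).choice(n, size=k, replace=False)
    centers = X[np.asarray(init_idx)].copy()
    w = np.ones(p)  # start with every feature equally weighted

    for _ in range(n_iter):
        # Block 1: assign each point to its nearest center in weighted distance.
        d = (((X[:, None, :] - centers[None, :, :]) ** 2) * w).sum(axis=2)
        labels = d.argmin(axis=1)
        # Block 2: recompute the center of every non-empty cluster as its mean.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
        # Block 3: closed-form weight update, w_l = max(B_l - lam, 0).
        within = ((X - centers[labels]) ** 2).mean(axis=0)
        new_w = np.maximum(total - within - lam, 0.0)
        if not new_w.any():
            # lam removed every feature; keep the previous weights and stop.
            break
        w = new_w
    return labels, centers, w
```

Each pass costs on the order of $nkp$ operations, the same order as one Lloyd iteration, which is the property the abstract emphasizes. In this sketch a feature whose between-cluster share falls below `lam` receives a weight of exactly zero, so larger values of `lam` select fewer features. Like Lloyd's method, the result depends on the initial centers.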