Centroid-based clustering validity: method and application to quantification of optimal cluster-data space.

Authors :
Nguyen, Sy Dzung
Source :
Soft Computing - A Fusion of Foundations, Methodologies & Applications. Oct2024, Vol. 28 Issue 19, p10853-10872. 20p.
Publication Year :
2024

Abstract

Evaluating clustering validity to establish an optimal cluster-data space (CDS) is a vital task in many fields related to data mining. Almost all existing clustering validity indexes (CVIs) lack stability because they are too sensitive to noise, especially impulse noise. Here, we (1) propose a new CVI named DzI (Dzung Index), or fRisk2, based on analysis of the fuzzy-set-based accumulated risk degree (FARD), and (2) present a new algorithm named fRisk2-bA for determining the optimal number of data clusters. The method evaluates the validity of centroid-based fuzzy clustering. In essence, fRisk2 still focuses on enhancing the data compression in each cluster and expanding the separation between cluster centroids. However, these features are exploited indirectly through FARD. As a result, the proposed method not only avoids the difficulties of traditional methods that rely directly on the compression and separation properties but also better distills local and global attributes of the data distribution to estimate the CDS more fully. Along with the proven theoretical basis, surveys, including ones based on noisy measurement datasets, showed the comparative advantages of fRisk2 as follows. (1) The accuracy, stability, and convergence of fRisk2 are outstanding. (2) Its total computational cost is lower than that of the other surveyed CVIs. [ABSTRACT FROM AUTHOR]
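
The abstract describes choosing the number of clusters by scoring centroid-based fuzzy partitions with a validity index. This record does not give the fRisk2 formula, so the sketch below does not implement fRisk2 or fRisk2-bA. As a stand-in it uses the classical Xie-Beni index, one of the traditional CVIs that measure compression and separation directly, together with a plain fuzzy c-means loop. All function names, the farthest-point initialisation, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def init_centroids(X, k):
    """Deterministic farthest-point initialisation (an assumption, not from the paper)."""
    idx = [0]
    d2 = ((X - X[0]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        nxt = int(d2.argmax())          # point farthest from all chosen centroids
        idx.append(nxt)
        d2 = np.minimum(d2, ((X - X[nxt]) ** 2).sum(axis=1))
    return X[idx].copy()

def memberships(X, centroids, m=2.0):
    """Fuzzy membership matrix U of shape (n_points, k); each row sums to 1."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2) + 1e-12
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fuzzy_c_means(X, k, m=2.0, iters=100):
    """Plain fuzzy c-means with a fixed number of iterations."""
    centroids = init_centroids(X, k)
    for _ in range(iters):
        W = memberships(X, centroids, m) ** m
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
    return centroids, memberships(X, centroids, m)

def xie_beni(X, centroids, U, m=2.0):
    """Xie-Beni index: fuzzy compactness over minimum centroid separation (lower is better)."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    compactness = ((U ** m) * d2).sum()
    c2 = ((centroids[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(c2, np.inf)        # ignore zero self-distances
    return compactness / (len(X) * c2.min())

def select_k(X, k_min=2, k_max=6):
    """Score each candidate cluster count and return the one with the lowest index."""
    scores = {}
    for k in range(k_min, k_max + 1):
        centroids, U = fuzzy_c_means(X, k)
        scores[k] = float(xie_beni(X, centroids, U))
    return min(scores, key=scores.get), scores

# Synthetic data: three well-separated Gaussian blobs (illustrative only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 8.66]])
X = np.vstack([c + 0.5 * rng.standard_normal((50, 2)) for c in centers])
best_k, scores = select_k(X)
print(best_k, scores)
```

Because an index of this kind depends directly on squared distances, a few impulse-noise points can shift both its numerator and its denominator, which is the sensitivity the abstract attributes to traditional CVIs.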

Details

Language :
English
ISSN :
14327643
Volume :
28
Issue :
19
Database :
Academic Search Index
Journal :
Soft Computing - A Fusion of Foundations, Methodologies & Applications
Publication Type :
Academic Journal
Accession number :
180373734
Full Text :
https://doi.org/10.1007/s00500-024-09871-0