Centroid-based clustering validity: method and application to quantification of optimal cluster-data space.
- Source :
- Soft Computing - A Fusion of Foundations, Methodologies & Applications. Oct2024, Vol. 28 Issue 19, p10853-10872. 20p.
- Publication Year :
- 2024
Abstract
- Evaluating clustering validity to establish an optimal cluster-data space (CDS) is a vital task in many fields related to data mining. Almost all existing clustering validity indexes (CVIs) lack stability because they are too sensitive to noise, especially impulse noise. Here, we (1) propose a new CVI named DzI (Dzung Index), or fRisk2, based on analysis of the fuzzy-set-based accumulated risk degree (FARD), and (2) present a new algorithm named fRisk2-bA for determining the optimal number of data clusters. The method evaluates the validity of centroid-based fuzzy clustering. In essence, fRisk2 still focuses on enhancing the data compression in each cluster and expanding the separation between cluster centroids; however, these features are exploited indirectly through FARD. As a result, the proposed method not only avoids the difficulties of traditional methods that rely directly on the compression and separation properties but also better captures local and global attributes of the data distribution, allowing a fuller estimate of the CDS. Along with the proven theoretical basis, surveys, including ones based on noisy datasets from measurements, showed the following comparative advantages of fRisk2. (1) Its accuracy, stability, and convergence are outstanding. (2) Its total computational cost is lower than that of the other surveyed CVIs. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 14327643
- Volume :
- 28
- Issue :
- 19
- Database :
- Academic Search Index
- Journal :
- Soft Computing - A Fusion of Foundations, Methodologies & Applications
- Publication Type :
- Academic Journal
- Accession number :
- 180373734
- Full Text :
- https://doi.org/10.1007/s00500-024-09871-0