Trimmed scores regression for k-means clustering data with high-missing ratio.
- Source :
- Communications in Statistics: Simulation & Computation; 2024, Vol. 53 Issue 6, p2805-2821, 17p
- Publication Year :
- 2024
Abstract
- Data sets with missing values pose great challenges for k-means clustering (KMC). At present, most studies focus on KMC data with a low missing ratio, while few address KMC data with a high missing ratio. Current imputation methods have the following problems when dealing with KMC data: (1) the error between the imputed value and the original true value is large, which leads to poor imputation precision; (2) the imputation results strongly influence the clustering results, which reduces clustering accuracy. To deal with these problems, we propose a novel imputation method called trimmed scores regression (TSR), which obtains an imputation estimator from a regression equation with a trimmed score matrix, and a novel clustering method based on k-means. Compared with other imputation methods in numerical analysis, the TSR method exhibits better performance. [ABSTRACT FROM AUTHOR]
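The abstract does not give the estimator's details, so the following is only a minimal sketch of an impute-then-cluster pipeline, not the authors' algorithm. It assumes one common formulation of trimmed scores regression from the PCA missing-data literature: each row's missing entries are regressed on the scores computed from its observed entries only, and the fit is iterated. The function names `tsr_impute` and `kmeans`, the number of components, and the stopping rule are all hypothetical choices made for illustration.

```python
import numpy as np


def tsr_impute(X, n_components=2, n_iter=100, tol=1e-8):
    """Iterative PCA-score regression imputation (assumed TSR-style variant).

    Missing entries are marked with NaN. Observed entries are never changed.
    """
    X = np.asarray(X, dtype=float)
    miss = np.isnan(X)
    # Start from column-mean imputation.
    Xf = np.where(miss, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mu = Xf.mean(axis=0)
        Xc = Xf - mu
        S = np.cov(Xc, rowvar=False)            # covariance of current fill
        _, eigvec = np.linalg.eigh(S)           # eigenvalues in ascending order
        P = eigvec[:, ::-1][:, :n_components]   # leading loadings
        X_new = Xf.copy()
        for i in np.flatnonzero(miss.any(axis=1)):
            m = miss[i]
            o = ~m
            if not o.any():
                continue  # fully missing row keeps its column-mean fill
            L = P[o]  # loadings restricted to the observed variables
            # Trimmed scores are L.T @ x_obs; regress missing variables on them.
            G = L.T @ S[np.ix_(o, o)] @ L
            coef = S[np.ix_(m, o)] @ L @ np.linalg.pinv(G) @ L.T
            X_new[i, m] = mu[m] + coef @ Xc[i, o]
        change = np.max(np.abs(X_new - Xf))
        Xf = X_new
        if change < tol:
            break
    return Xf


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on complete data."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    dist = ((X[:, None, :] - centers[None]) ** 2).sum(axis=2)
    return dist.argmin(axis=1), centers
```

A typical call would be `labels, centers = kmeans(tsr_impute(X_with_nans), k=3)`. Imputing first and clustering second keeps the two stages independent, which makes it easy to swap in another imputation method for comparison.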
- Subjects :
- K-means clustering
MULTIPLE imputation (Statistics)
NUMERICAL analysis
- Language :
- English
- ISSN :
- 03610918
- Volume :
- 53
- Issue :
- 6
- Database :
- Complementary Index
- Journal :
- Communications in Statistics: Simulation & Computation
- Publication Type :
- Academic Journal
- Accession number :
- 178068613
- Full Text :
- https://doi.org/10.1080/03610918.2022.2091779