1. Revolutionizing Missing Data Handling with RFKFCM: Random Forest-based Kernelized Fuzzy C-Means.
- Author
Jyoti, Singh, Jaspreeti, and Gosain, Anjana
- Subjects
STANDARD deviations, MISSING data (Statistics), MULTIPLE imputation (Statistics), RANDOM forest algorithms
- Abstract
Missing values are a prevalent issue and frequently lead to a considerable decline in data quality, so missing data must be handled carefully. This study presents a technique for imputing missing values that incorporates Kernelized Fuzzy C-Means (KFCM) clustering. The technique, termed RFKFCM, combines the benefits of the Random Forest (RF) and KFCM algorithms. Its performance is evaluated through a comparative analysis with five state-of-the-art imputation techniques (mean, LI, KNN, FCM, and LIFCM) across four widely used real-world datasets obtained from the UCI repository. Additionally, experiments at different missing rates affirm the robustness of the proposed technique, using Friedman's statistical test on each dataset to compare the techniques' mean ranks. The experimental results indicate that the proposed technique performs significantly better than the existing imputation techniques on these datasets in terms of root mean squared error (RMSE) and mean absolute error (MAE). [ABSTRACT FROM AUTHOR]
- Published
2024