1. Fuzzy C-mean Missing Data Imputation for Analogy-based Effort Estimation
- Author
Dayang N. A. Jawawi, Ayman Jalal Hassan Almutlaq, and Adila Firdaus Arbain
- Subjects
Estimation, General Computer Science, Computer science, business.industry, Software development, Analogy, Missing data, computer.software_genre, Fuzzy logic, Identification (information), Software, Imputation (statistics), Data mining, business, computer
- Abstract
The accuracy of effort estimation is one of the major factors in the success or failure of software projects. Analogy-Based Estimation (ABE) is a widely accepted estimation model because it follows the human tendency to select analogies similar in nature to the target project. The prediction accuracy of the ABE model is strongly associated with the quality of the dataset, since it depends on previously completed projects for estimation. Missing Data (MD) is one of the major challenges in software engineering datasets. Researchers have investigated several missing data imputation techniques for the ABE model. Identifying the most similar donor values from the dataset of completed software projects is a challenging issue in the existing missing data techniques adopted for the ABE model. In this study, Fuzzy C-Mean Imputation (FCMI), Mean Imputation (MI), and K-Nearest Neighbor Imputation (KNNI) are investigated for imputing missing values in the Desharnais dataset under different missing data percentages (Desh-Miss1, Desh-Miss2) for the ABE model. The FCMI-ABE technique is proposed in this study. MI, KNNI, and FCMI-ABE are compared under the ABE model to identify the most suitable MD imputation method. The results suggest that FCMI-ABE, rather than MI or KNNI, imputes more reliable values for incomplete software projects in the datasets with missing values. The proposed imputation method was also found to significantly improve the software development effort prediction of the ABE model.
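To make the idea of fuzzy c-means imputation concrete, here is a minimal Python sketch. It is not the authors' FCMI-ABE implementation, whose details the abstract does not give. It assumes one common variant: cluster the complete projects, compute each incomplete project's memberships from its observed features only, and fill each missing feature with the membership-weighted average of the cluster centroids. The function names `fuzzy_c_means` and `fcm_impute`, the parameters `c` and `m`, and the toy data are all hypothetical. Real project features would normally be scaled before clustering.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-means on complete rows; returns centroids of shape (c, d)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # memberships sum to 1 per row
    for _ in range(iters):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]   # membership-weighted centroids
        D = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + 1e-12
        U = 1.0 / D ** (2.0 / (m - 1.0))         # standard FCM membership update
        U /= U.sum(axis=1, keepdims=True)
    return V

def fcm_impute(X, c=2, m=2.0):
    """Fill NaNs using centroids learned from the complete rows (assumed variant)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    complete = ~missing.any(axis=1)
    V = fuzzy_c_means(X[complete], c=c, m=m)
    out = X.copy()
    for i in np.where(~complete)[0]:
        obs = ~missing[i]
        # Distance to each centroid, measured on the observed features only.
        d = np.linalg.norm(V[:, obs] - X[i, obs], axis=1) + 1e-12
        u = 1.0 / d ** (2.0 / (m - 1.0))
        u /= u.sum()
        out[i, ~obs] = u @ V[:, ~obs]            # membership-weighted centroid values
    return out

# Toy data (hypothetical): two well-separated groups, last row has a missing value.
X = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
              [10.0, 10.0], [10.2, 9.9], [9.8, 10.1],
              [1.0, np.nan]])
filled = fcm_impute(X, c=2)
print(filled[-1])
```

Mean imputation would fill the missing cell with the column mean over all projects, which here lies between the two groups. The fuzzy approach instead weights the centroids by how close the incomplete project is to each cluster, so the filled value follows the nearer group.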
- Published
- 2021