Metaheuristic-based possibilistic fuzzy k-modes algorithms for categorical data clustering.

Authors :
Kuo, R.J.
Zheng, Y.R.
Nguyen, Thi Phuong Quyen
Source :
Information Sciences. May 2021, Vol. 557, pp. 1-15 (15 pages).
Publication Year :
2021

Abstract

Smart devices and technology applications are used in many fields. Much information is now recorded and collected rapidly, so data analysis, especially clustering analysis, is vital for extracting valuable information from datasets. However, data can have different types of attributes: numerical, categorical, or mixed. Some datasets also contain noise and outliers. An appropriate clustering method is necessary to exploit the data structure. This study proposes a clustering algorithm called the possibilistic fuzzy k-modes (PFKM) algorithm, which combines the concept of possibility with the fuzzy k-modes (FKM) algorithm to address the effect of outliers and improve clustering results for categorical data. This study also implements three metaheuristics to increase clustering performance: a genetic algorithm (GA), particle swarm optimization (PSO), and the sine-cosine algorithm (SCA). Three clustering algorithms are proposed: the GA-PFKM, PSO-PFKM, and SCA-PFKM algorithms. Their performance is compared with that of the classical FKM algorithm using two indices: the sum-of-squared error (SSE) and accuracy. The experimental results show that the PSO-PFKM and SCA-PFKM algorithms perform better for most datasets. [ABSTRACT FROM AUTHOR]
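
For orientation, the classical FKM baseline named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' PFKM: it assumes the standard simple-matching dissimilarity and a fuzzifier m = 2, and it omits the possibilistic terms and the GA/PSO/SCA search, none of which are specified in this record. All function names are hypothetical.

```python
from collections import defaultdict


def matching_distance(x, z):
    """Simple-matching dissimilarity: number of attributes that differ."""
    return sum(a != b for a, b in zip(x, z))


def update_memberships(data, modes, m=2.0):
    """Fuzzy membership of every object in every cluster (each row sums to 1)."""
    exponent = 1.0 / (m - 1.0)
    memberships = []
    for x in data:
        dists = [matching_distance(x, z) for z in modes]
        if 0 in dists:
            # Object coincides with a mode: crisp membership in that cluster.
            hit = dists.index(0)
            row = [1.0 if i == hit else 0.0 for i in range(len(modes))]
        else:
            row = [1.0 / sum((d_i / d_l) ** exponent for d_l in dists)
                   for d_i in dists]
        memberships.append(row)
    return memberships


def update_modes(data, memberships, k, m=2.0):
    """Per cluster and attribute, pick the category with the largest
    membership-weighted count (ties broken by sorted category order)."""
    n_attrs = len(data[0])
    modes = []
    for i in range(k):
        mode = []
        for a in range(n_attrs):
            weight = defaultdict(float)
            for x, row in zip(data, memberships):
                weight[x[a]] += row[i] ** m
            mode.append(max(sorted(weight), key=weight.get))
        modes.append(tuple(mode))
    return modes


def fuzzy_k_modes(data, init_modes, m=2.0, max_iter=50):
    """Alternate membership and mode updates until the modes stop changing."""
    modes = [tuple(z) for z in init_modes]
    for _ in range(max_iter):
        u = update_memberships(data, modes, m)
        new_modes = update_modes(data, u, len(modes), m)
        if new_modes == modes:
            break
        modes = new_modes
    return modes, update_memberships(data, modes, m)
```

Because every membership row in this sketch is forced to sum to 1, an outlier far from all modes still receives substantial membership somewhere. That is the effect the abstract says the possibilistic extension is meant to address.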

Details

Language :
English
ISSN :
0020-0255
Volume :
557
Database :
Academic Search Index
Journal :
Information Sciences
Publication Type :
Periodical
Accession number :
149330898
Full Text :
https://doi.org/10.1016/j.ins.2020.12.051