Automated granule discovery in continuous data for feature selection.

Authors :
Sewwandi, M.A.N.D.
Li, Yuefeng
Zhang, Jinglan
Source :
Information Sciences. Nov 2021, Vol. 578, p323-343. 21p.
Publication Year :
2021

Abstract

Real-world database applications hold massive data collections in different data formats, such as continuous, discrete or nominal. Continuous data makes the analysis process more complex because the data can take any value within a particular range, so granule mining has recently been used with techniques such as neighbourhood rough sets to discover granules in continuous data. This approach has yet to address granule resolution design, so this paper presents a novel method, Hierarchical Clustering-based Granulation (HCluG), to improve granule identification in continuous data. HCluG combines hierarchical clustering with neighbourhood rough sets, reduces user involvement in tuning granule resolution parameters and introduces an automated granule discovery method. HCluG includes a feature selection method to evaluate the quality of the granules generated with the proposed granule approximations. Experimental results show that HCluG reduces the number of selected features while improving classification performance. On average, HCluG outperforms the rough sets-based feature selection baselines when used with K-Nearest Neighbours and Radial Basis Function Support Vector Machine, and performs better than using the complete feature set. This method can be used in data analysis to achieve high classification performance with fewer features and less user involvement. [ABSTRACT FROM AUTHOR]
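
For orientation, the following is a minimal sketch of generic neighbourhood rough set feature selection, the family of techniques the abstract builds on. It is not the HCluG algorithm, whose details are not given in this record. The function names, the Euclidean distance, the greedy forward search and the toy data are all assumptions made for illustration. The radius `delta` stands in for the kind of user-tuned granule resolution parameter that the abstract says HCluG aims to reduce reliance on.

```python
from math import sqrt

def neighborhood(X, i, features, delta):
    """Indices of samples within Euclidean distance delta of sample i,
    measured only on the given feature indices (a neighbourhood granule)."""
    return [
        j for j in range(len(X))
        if sqrt(sum((X[i][f] - X[j][f]) ** 2 for f in features)) <= delta
    ]

def dependency(X, y, features, delta):
    """Fraction of samples whose neighbourhood granule contains one class only
    (the size of the positive region divided by the number of samples)."""
    if not features:
        return 0.0
    consistent = sum(
        1 for i in range(len(X))
        if all(y[j] == y[i] for j in neighborhood(X, i, features, delta))
    )
    return consistent / len(X)

def greedy_select(X, y, delta):
    """Forward selection: repeatedly add the feature that raises the
    dependency the most; stop when no feature improves it."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        scored = [(dependency(X, y, selected + [f], delta), f) for f in remaining]
        # Prefer the highest score; break ties by the lowest feature index.
        score, f = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected, best

# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.10, 0.50], [0.15, 0.90], [0.20, 0.10],
     [0.80, 0.55], [0.85, 0.95], [0.90, 0.05]]
y = [0, 0, 0, 1, 1, 1]

selected, gamma = greedy_select(X, y, delta=0.2)
print(selected, gamma)
```

On this toy data the search keeps only feature 0, because every neighbourhood built on it is already pure in class. A different `delta` can change which granules are pure, which illustrates why the choice of resolution matters.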

Details

Language :
English
ISSN :
00200255
Volume :
578
Database :
Academic Search Index
Journal :
Information Sciences
Publication Type :
Periodical
Accession number :
152901061
Full Text :
https://doi.org/10.1016/j.ins.2021.07.042