
A Clustering Algorithm in Stream Data Using Strong Coreset.

Authors :
Singh, Manmohan
Pamula, Rajendra
Kumar, Alok
Source :
Journal of Interconnection Networks. 2022 Supplement, Vol. 22, p1-21. 21p.
Publication Year :
2022

Abstract

Clustering has various applications in machine learning, data mining, data compression, and pattern recognition. Existing techniques such as Lloyd's algorithm (sometimes called k-means) suffer from convergence to a local optimum with no approximation guarantee. To overcome these shortcomings, this paper offers an efficient k-means clustering approach for stream data mining. The coreset is a popular and fundamental concept for k-means clustering in stream data. In each step, reduction determines a coreset of inputs, and represents the error, where P represents the number of input points according to the nested property of coresets. Hence, a small reduction in error of the final coreset gets n times more accurate. This motivated the authors to propose a new coreset-reduction algorithm. The proposed algorithm was executed on the Covertype, Spambase, Census 1990, Bigcross, and Tower datasets. Our algorithm outperforms competitive algorithms such as StreamKM++, BICO (BIRCH meets Coresets for k-means clustering), and BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies). [ABSTRACT FROM AUTHOR]
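
Editorial illustration (not part of the author abstract): the abstract does not specify the proposed reduction, so the following is only a minimal sketch of the generic merge-and-reduce pattern commonly used to maintain a weighted summary over a stream, which may help in reading the remark about per-step reductions. Every name here (`kmeans_cost`, `reduce_weighted`, `merge_and_reduce`, the summary size `m`) is hypothetical, and the reduction step is a plain weight-proportional sample chosen for brevity. It is an assumption standing in for a real coreset construction and carries no strong-coreset guarantee.

```python
import random

def kmeans_cost(points, weights, centers):
    """Weighted k-means cost: sum of w * squared distance to the nearest center."""
    total = 0.0
    for p, w in zip(points, weights):
        total += w * min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
    return total

def reduce_weighted(points, weights, m, rng):
    """Placeholder reduction (assumption): sample m points proportionally to weight.
    It preserves total weight but is NOT a strong coreset construction."""
    if len(points) <= m:
        return list(points), list(weights)
    total = sum(weights)
    sample = rng.choices(points, weights=weights, k=m)
    return sample, [total / m] * m

def merge_and_reduce(stream, m, rng):
    """Keep at most one summary of size <= m per level, like a binary counter."""
    levels = {}   # level -> (points, weights)
    buffer = []

    def push(summary, level):
        # Two summaries on the same level are merged, reduced, and promoted.
        while level in levels:
            other = levels.pop(level)
            summary = reduce_weighted(other[0] + summary[0],
                                      other[1] + summary[1], m, rng)
            level += 1
        levels[level] = summary

    for p in stream:
        buffer.append(p)
        if len(buffer) == m:
            push((buffer, [1.0] * m), 0)
            buffer = []

    # Final summary: leftover buffer plus every stored level.
    pts, wts = list(buffer), [1.0] * len(buffer)
    for level_pts, level_w in levels.values():
        pts += level_pts
        wts += level_w
    return pts, wts

# Usage on synthetic data (two Gaussian blobs; not one of the paper's datasets).
rng = random.Random(0)
data = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
        for cx, cy in [(0.0, 0.0), (10.0, 10.0)] for _ in range(500)]
rng.shuffle(data)
summary_pts, summary_w = merge_and_reduce(data, m=50, rng=rng)
centers = [(0.0, 0.0), (10.0, 10.0)]
full_cost = kmeans_cost(data, [1.0] * len(data), centers)
summary_cost = kmeans_cost(summary_pts, summary_w, centers)
print(len(summary_pts), full_cost, summary_cost)
```

Each point that reaches a high level has passed through several reductions, so any per-step approximation error compounds along the way. This is why, in this kind of scheme, the quality of the individual reduction step matters for the final summary.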

Details

Language :
English
ISSN :
02192659
Volume :
22
Database :
Academic Search Index
Journal :
Journal of Interconnection Networks
Publication Type :
Academic Journal
Accession number :
159652123
Full Text :
https://doi.org/10.1142/S0219265921430118