Clustering Through Probability Distribution Analysis Along Eigenpaths.

Authors :
Yang, Wenming
Hui, Changqing
Sun, Daren
Sun, Xiang
Liao, Qingmin
Source :
IEEE Transactions on Systems, Man & Cybernetics. Systems. Feb 2021, Vol. 51 Issue 2, p875-884. 10p.
Publication Year :
2021

Abstract

Data clustering is one of the most fundamental techniques in exploratory data analysis. It is widely used for determining underlying data structure, classifying natural data, and compressing data in engineering, business management, social statistics, computer science, and medicine. Under the assumption that clusters are high-density regions of the feature space separated by relatively low-density neighbors, a novel approach is proposed that models any high-dimensional clustering problem as a one-dimensional analysis of the probability distribution. First, a special path between two vertices, the eigenpath, is defined to represent their close connection. Second, a connectedness index based on the eigenpath is proposed to quantitatively describe the connection between two vertices. Third, the connectedness index is applied to the candidate cluster centers to measure the connection between different candidates. An indicative curve can then be drawn from the connectedness index. This approach not only provides an effective indicative curve for unknown data sets but also helps to partly eliminate the curse of dimensionality, correctly recognizes arbitrary cluster shapes, and automatically excludes outliers. Extensive experiments showed the effectiveness and efficiency of the proposed approach. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
2168-2216
Volume :
51
Issue :
2
Database :
Academic Search Index
Journal :
IEEE Transactions on Systems, Man & Cybernetics. Systems
Publication Type :
Academic Journal
Accession number :
148208191
Full Text :
https://doi.org/10.1109/TSMC.2018.2884839