
Fast and scalable support vector clustering for large-scale data analysis

Authors :
Yun Feng Chang
Zhili Zhang
Ying Jie Tian
Yi Xian Yang
Yuan Ping
Yajian Zhou
Source :
Knowledge and Information Systems. 43:281-310
Publication Year :
2014
Publisher :
Springer Science and Business Media LLC, 2014.

Abstract

As an important boundary-based clustering algorithm, support vector clustering (SVC) benefits many applications through its ability to handle arbitrary cluster shapes. However, its adoption is limited by two problems: computationally expensive training, caused by the redundant kernel matrix required to estimate the support function, and poor labeling performance, caused by inefficient checking of segments between all pairs of data points. To address these two problems, this paper proposes a fast and scalable SVC (FSSVC) method that significantly improves efficiency while guaranteeing accuracy comparable to state-of-the-art methods. The core of our approach consists of (1) constructing the hypersphere and support function from cluster boundaries, which prunes unnecessary computation and storage of kernel functions, and (2) an adaptive labeling strategy that decomposes clusters into convex hulls and then applies either a convex-decomposition-based cluster labeling algorithm or a cone cluster labeling algorithm, depending on whether the radius of the hypersphere is greater than 1. Both theoretical analysis and experimental results (e.g., the top rank in a nonparametric statistical test) show the superiority of our method over the others, especially for large-scale data analysis under limited memory.
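To illustrate the labeling bottleneck the abstract refers to, the following is a minimal sketch of conventional pairwise segment checking as commonly used in SVC. It is not the FSSVC method. Two points are treated as connected when sampled points on the segment between them all stay inside the sphere. The names `segment_inside`, `label_by_segments`, and `toy_support`, the sample count, and the toy support function are assumptions introduced purely for illustration; a real support function would come from the trained model.

```python
import numpy as np

def segment_inside(f, x, y, radius, n_samples=10):
    """True if every sampled point on the segment x -> y satisfies f(z) <= radius."""
    ts = np.linspace(0.0, 1.0, n_samples)
    return all(f(x + t * (y - x)) <= radius for t in ts)

def label_by_segments(f, points, radius, n_samples=10):
    """Label connected components using pairwise segment checks.

    Worst case needs on the order of n^2 segment checks, each costing
    n_samples evaluations of the support function f.
    """
    n = len(points)
    labels = -np.ones(n, dtype=int)  # -1 marks "not yet labeled"
    current = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        labels[i] = current
        stack = [i]
        while stack:  # depth-first traversal of the connectivity graph
            a = stack.pop()
            for b in range(n):
                if labels[b] == -1 and segment_inside(
                    f, points[a], points[b], radius, n_samples
                ):
                    labels[b] = current
                    stack.append(b)
        current += 1
    return labels

# Hypothetical stand-in for a trained support function:
# distance to the nearest of two fixed centers.
centers = np.array([[0.0, 0.0], [5.0, 0.0]])

def toy_support(z):
    return np.min(np.linalg.norm(centers - z, axis=1))

points = np.array([[0.0, 0.0], [0.5, 0.0], [5.0, 0.0], [5.0, 0.5]])
labels = label_by_segments(toy_support, points, radius=1.0)
print(labels.tolist())
```

The nested loop over point pairs, multiplied by the per-segment sampling, is the cost that grows quickly with data size and that the adaptive labeling strategy described above aims to avoid.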

Details

ISSN :
0219-3116 and 0219-1377
Volume :
43
Database :
OpenAIRE
Journal :
Knowledge and Information Systems
Accession number :
edsair.doi...........e169682edf4c8c5ddd2442768cc365d2
Full Text :
https://doi.org/10.1007/s10115-013-0724-9