1. A Hybrid Approach to Clustering in Very Large Databases
- Author
- Ye Fan, Weining Qian, Shuigeng Zhou, Jin Wen, Aoying Zhou, and Hailei Qian
- Subjects
Data set, Speedup, Computer science, Cluster (physics), Quality (business), Data mining, Hybrid approach, Cluster analysis, Algorithm
- Abstract
Current clustering methods commonly suffer from three problems: 1) high I/O cost and expensive maintenance; 2) the need to pre-specify the uncertain parameter k; 3) poor efficiency in handling clusters of arbitrary shape in very large data sets. In this paper, we first present a hybrid clustering algorithm to solve these problems. It combines distance-based and density-based strategies and makes full use of statistical information while maintaining good cluster quality. The experimental results show that our algorithm outperforms other popular algorithms in terms of efficiency and cost, and achieves even greater speedup as the data size scales up.
- Published
- 2001