A Supervised Clustering and Classification Algorithm for Mining Data With Mixed Variables.

Authors :
Xiangyang Li
Nong Ye
Source :
IEEE Transactions on Systems, Man & Cybernetics: Part A. Mar 2006, Vol. 36 Issue 2, p396-406. 11p. 2 Black and White Photographs, 1 Diagram, 4 Charts, 6 Graphs.
Publication Year :
2006

Abstract

This paper presents a data mining algorithm based on supervised clustering to learn data patterns and use these patterns for data classification. This algorithm enables a scalable incremental learning of patterns from data with both numeric and nominal variables. Two different methods of combining numeric and nominal variables in calculating the distance between clusters are investigated. In one method, separate distance measures are calculated for numeric and nominal variables, respectively, and are then combined into an overall distance measure. In another method, nominal variables are converted into numeric variables, and then a distance measure is calculated using all variables. We analyze the computational complexity, and thus, the scalability, of the algorithm, and test its performance on a number of data sets from various application domains. The prediction accuracy and reliability of the algorithm are analyzed, tested, and compared with those of several other data mining algorithms. [ABSTRACT FROM AUTHOR]
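
The two ways of combining numeric and nominal variables described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual distance measures, so everything below is an assumption chosen for illustration: Euclidean distance for numeric variables, a simple mismatch fraction for nominal variables, a hypothetical `weight` parameter for combining them, and one-hot encoding as the nominal-to-numeric conversion. The sketch also compares two individual records rather than clusters, since the abstract does not say how clusters are represented.

```python
import math


def numeric_distance(a, b):
    """Euclidean distance between two equal-length numeric vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nominal_distance(a, b):
    """Fraction of nominal variables whose values differ (assumed measure)."""
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def combined_distance(num_a, nom_a, num_b, nom_b, weight=0.5):
    """Method 1: compute separate distances, then combine them.

    `weight` is a hypothetical mixing parameter, not taken from the paper.
    """
    d_num = numeric_distance(num_a, num_b)
    d_nom = nominal_distance(nom_a, nom_b)
    return (1 - weight) * d_num + weight * d_nom


def one_hot(value, categories):
    """Encode one nominal value as a 0/1 vector over its known categories."""
    return [1.0 if value == c else 0.0 for c in categories]


def converted_distance(num_a, nom_a, num_b, nom_b, category_lists):
    """Method 2: convert nominal variables to numeric, then use one distance.

    One-hot encoding is an assumed conversion; the paper may use another.
    """
    vec_a, vec_b = list(num_a), list(num_b)
    for va, vb, cats in zip(nom_a, nom_b, category_lists):
        vec_a += one_hot(va, cats)
        vec_b += one_hot(vb, cats)
    return numeric_distance(vec_a, vec_b)


# Two illustrative records: two numeric fields and two nominal fields each.
record_a = ([0.0, 0.0], ["red", "tcp"])
record_b = ([3.0, 4.0], ["red", "udp"])
categories = [["red", "blue"], ["tcp", "udp"]]

d1 = combined_distance(*record_a, *record_b)
d2 = converted_distance(*record_a, *record_b, categories)
```

In this sketch the first method keeps the two kinds of variables on separate scales until the final weighted sum, while the second lets every encoded nominal category contribute to the same Euclidean sum as the numeric fields. In practice the numeric variables would likely need scaling before either approach gives balanced results.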

Details

Language :
English
ISSN :
1083-4427
Volume :
36
Issue :
2
Database :
Academic Search Index
Journal :
IEEE Transactions on Systems, Man & Cybernetics: Part A
Publication Type :
Academic Journal
Accession number :
20414389
Full Text :
https://doi.org/10.1109/TSMCA.2005.853501