Adaptive Noise Immune Cluster Ensemble Using Affinity Propagation.

Authors :
Yu, Zhiwen
Li, Le
Liu, Jiming
Zhang, Jun
Han, Guoqiang
Source :
IEEE Transactions on Knowledge & Data Engineering. Dec2015, Vol. 27 Issue 12, p3176-3189. 14p.
Publication Year :
2015

Abstract

Cluster ensemble is one of the main branches of ensemble learning, which has been an important research focus in recent years. The objective of cluster ensemble is to combine multiple clustering solutions in a suitable way to improve the quality of the clustering result. In this paper, we design a new noise immune cluster ensemble framework named AP²CE to tackle the challenges raised by noisy datasets. AP²CE not only takes advantage of the affinity propagation algorithm (AP) and the normalized cut algorithm (Ncut), but also possesses the characteristics of cluster ensemble. Compared with traditional cluster ensemble approaches, AP²CE is characterized by several properties. (1) It adopts multiple distance functions instead of a single Euclidean distance function to avoid the noise related to the distance function. (2) AP²CE applies AP to prune noisy attributes and generate a set of new datasets in subspaces consisting of the representative attributes obtained by AP. (3) It avoids the explicit specification of the number of clusters. (4) AP²CE adopts the normalized cut algorithm as the consensus function to partition the consensus matrix and obtain the final result. In order to improve the performance of AP²CE, the adaptive AP²CE is designed, which makes use of an adaptive process to optimize a newly designed objective function. The experiments on both synthetic and real datasets show that (1) AP²CE works well on most of the datasets, in particular the noisy datasets; (2) AP²CE is a better choice for most of the datasets when compared with other cluster ensemble approaches; (3) AP²CE has the capability to provide more accurate, stable and robust results. [ABSTRACT FROM PUBLISHER]
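
The abstract does not say how the consensus matrix is built. A common construction in cluster ensemble work is the co-association matrix, where entry (i, j) is the fraction of base clusterings that place samples i and j in the same cluster. The sketch below illustrates that idea only, under this assumption; it is not the paper's implementation, and the function name `co_association` and the sample labelings are hypothetical.

```python
from typing import List, Sequence


def co_association(partitions: Sequence[Sequence[int]]) -> List[List[float]]:
    """Return the fraction of base clusterings in which each pair of samples shares a cluster.

    Assumption: this is one common way to form a consensus matrix;
    the abstract does not specify the construction used by the authors.
    """
    if not partitions:
        raise ValueError("need at least one base clustering")
    n = len(partitions[0])
    if any(len(p) != n for p in partitions):
        raise ValueError("all base clusterings must label the same samples")

    matrix = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                # Count one vote whenever this clustering groups i and j together.
                if labels[i] == labels[j]:
                    matrix[i][j] += 1.0

    m = len(partitions)
    return [[votes / m for votes in row] for row in matrix]


# Three hypothetical base clusterings of four samples.
# Label values are arbitrary: only co-membership matters.
base = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
consensus = co_association(base)
```

A consensus function such as normalized cut would then partition this matrix, treating it as a similarity graph over the samples.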

Details

Language :
English
ISSN :
10414347
Volume :
27
Issue :
12
Database :
Academic Search Index
Journal :
IEEE Transactions on Knowledge & Data Engineering
Publication Type :
Academic Journal
Accession number :
110834641
Full Text :
https://doi.org/10.1109/TKDE.2015.2453162