Beyond cluster labeling: Semantic interpretation of clusters’ contents using a graph representation

Authors :
Mohamed Nadif
François Role
Source :
Knowledge-Based Systems. 56:141-155
Publication Year :
2014
Publisher :
Elsevier BV, 2014.

Abstract

Efficient clustering algorithms have been developed to automatically group documents into subgroups (clusters). Once clustering has been performed, an important additional step is to help users make sense of the obtained clusters. Existing methods address this issue by assigning to each cluster a flat list of descriptive terms (labels) that are extracted from the documents, most often using statistical techniques borrowed from the field of feature selection or reduction. A limitation of these unstructured descriptions of clusters' contents is that they do not account for the meaningful relationships between the terms. In contrast, we propose a graph representation, which makes the clusters easier to interpret by putting the descriptive terms in context, and by performing some simple network analysis. Our experiments reveal that the proposed method allows for a deeper level of interpretation, both when the clusters under study are homogeneous and when they are heterogeneous. In addition, evaluation procedures presented in the paper show that the graph-based representation of each cluster, while being very synthetic, still quite faithfully reflects the original content of the cluster.
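
The contrast between a flat label list and a graph of related terms can be illustrated with a small sketch. The abstract does not say how the graph is built or which network measures are used, so everything here is an assumption made for illustration: edges link label terms that co-occur in the same document, edge weights count those documents, and weighted degree stands in for the "simple network analysis". The function names, sample documents, and label terms are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def build_term_graph(docs, terms):
    """Build a co-occurrence graph over the label terms of one cluster.

    Assumption (not from the paper): two terms are linked when they appear
    in the same document; the edge weight is the number of such documents.
    Returns a Counter mapping (term_a, term_b) pairs, term_a < term_b, to weights.
    """
    vocab = set(terms)
    edges = Counter()
    for doc in docs:
        # Naive whitespace tokenization, enough for the illustration.
        present = sorted(vocab & set(doc.lower().split()))
        edges.update(combinations(present, 2))
    return edges


def weighted_degree(edges):
    """Sum the weights of the edges incident to each term."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Hypothetical cluster contents and label terms.
cluster_docs = [
    "graph clustering of documents",
    "document clustering with graph labels",
    "labels for clustering",
]
label_terms = ["graph", "clustering", "labels"]

edges = build_term_graph(cluster_docs, label_terms)
degree = weighted_degree(edges)

for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b} (weight {weight})")
print("most connected term:", degree.most_common(1)[0][0])
```

A flat list would present the three terms with equal standing; the graph additionally shows which terms occur together and which one ties the others to each other, which is the kind of context the abstract argues for.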

Details

ISSN :
0950-7051
Volume :
56
Database :
OpenAIRE
Journal :
Knowledge-Based Systems
Accession number :
edsair.doi...........efc39b58c052508cb1f77bbd0db1d0ed
Full Text :
https://doi.org/10.1016/j.knosys.2013.11.005