An improved ant algorithm with LDA-based representation for text document clustering.

Authors :
Onan, Aytug
Bulut, Hasan
Korukoglu, Serdar
Source :
Journal of Information Science. Apr 2017, Vol. 43 Issue 2, p275-292. 18p.
Publication Year :
2017

Abstract

Document clustering can be applied in document organisation and browsing, document summarisation and classification. The identification of an appropriate representation for textual documents is extremely important for the performance of clustering or classification algorithms. Textual documents suffer from the high dimensionality and irrelevancy of text features. Besides, conventional clustering algorithms suffer from several shortcomings, such as slow convergence and sensitivity to the initial value. To tackle the problems of conventional clustering algorithms, metaheuristic algorithms are frequently applied to clustering. In this paper, an improved ant clustering algorithm is presented, where two novel heuristic methods are proposed to enhance the clustering quality of ant-based clustering. In addition, the latent Dirichlet allocation (LDA) is used to represent textual documents in a compact and efficient way. The clustering quality of the proposed ant clustering algorithm is compared to the conventional clustering algorithms using 25 text benchmarks in terms of F-measure values. The experimental results indicate that the proposed clustering scheme outperforms the compared conventional and metaheuristic clustering methods for textual documents. [ABSTRACT FROM AUTHOR]
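
The abstract reports clustering quality as F-measure values but does not spell out the formula. As an illustration only, the sketch below computes one widely used clustering F-measure: for each true class, take the best F1 score over all clusters, then average those scores weighted by class size. Whether the paper uses exactly this variant is an assumption, and the function name `clustering_f_measure` and the toy labels are hypothetical.

```python
from collections import Counter


def clustering_f_measure(true_labels, cluster_ids):
    """Class-size-weighted clustering F-measure.

    Assumed definition (not confirmed by the abstract):
    F = sum_i (n_i / n) * max_j F1(class i, cluster j).
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("inputs must have the same length")
    n = len(true_labels)
    if n == 0:
        raise ValueError("inputs must be non-empty")

    class_sizes = Counter(true_labels)                # n_i
    cluster_sizes = Counter(cluster_ids)              # n_j
    overlap = Counter(zip(true_labels, cluster_ids))  # n_ij

    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = overlap.get((cls, clu), 0)
            if n_ij == 0:
                continue  # no shared documents, so F1 is 0
            precision = n_ij / n_j
            recall = n_ij / n_i
            f1 = 2 * precision * recall / (precision + recall)
            best = max(best, f1)
        total += (n_i / n) * best
    return total


# Toy example: four documents, two true topics.
labels = ["sports", "sports", "politics", "politics"]
perfect = clustering_f_measure(labels, [0, 0, 1, 1])
merged = clustering_f_measure(labels, [0, 0, 0, 0])
```

In the toy example, a clustering that matches the topics exactly scores 1.0, while placing every document in a single cluster keeps recall at 1 but halves precision, so the score drops to 2/3.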

Details

Language :
English
ISSN :
0165-5515
Volume :
43
Issue :
2
Database :
Academic Search Index
Journal :
Journal of Information Science
Publication Type :
Academic Journal
Accession number :
121615539
Full Text :
https://doi.org/10.1177/0165551516638784