SOTXTSTREAM: Density-based self-organizing clustering of text streams.

Authors :
Bryant, Avory C.
Cios, Krzysztof J.
Source :
PLoS ONE; 7/7/2017, Vol. 12 Issue 7, p1-25, 25p
Publication Year :
2017

Abstract

A streaming data clustering algorithm is presented that builds upon the density-based self-organizing stream clustering algorithm SOSTREAM. Many density-based clustering algorithms are limited by their inability to identify clusters with heterogeneous density. SOSTREAM addresses this limitation through the use of local (nearest neighbor-based) density determinations. Additionally, many stream clustering algorithms use a two-phase clustering approach. In the first phase, a micro-clustering solution is maintained online, while in the second phase, the micro-clustering solution is clustered offline to produce a macro solution. By performing self-organization techniques on micro-clusters in the online phase, SOSTREAM is able to maintain a macro clustering solution in a single phase. Leveraging concepts from SOSTREAM, a new density-based self-organizing text stream clustering algorithm, SOTXTSTREAM, is presented that addresses several shortcomings of SOSTREAM. Gains in clustering performance of this new algorithm are demonstrated on several real-world text stream datasets. [ABSTRACT FROM AUTHOR]
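
The single-phase idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' SOSTREAM or SOTXTSTREAM implementation: the names `MicroCluster`, `update`, `min_pts`, and `alpha` are hypothetical, the points are plain numeric vectors compared with Euclidean distance (a text stream would need a document representation and a suitable similarity measure), and fading or merging of clusters is omitted. It only shows two ingredients the abstract names: a local radius taken from the nearest neighbors of the winning micro-cluster, and a self-organizing step that pulls those neighbors toward the winner during the online phase.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class MicroCluster:
    """Hypothetical micro-cluster: a centroid, a point count, a local radius."""

    def __init__(self, centroid):
        self.centroid = list(centroid)
        self.count = 1
        self.radius = 0.0


def update(clusters, point, min_pts=2, alpha=0.1):
    """Process one stream point and return the micro-cluster it ends up in.

    min_pts and alpha are illustrative parameters (assumptions), not values
    taken from the paper.
    """
    # Too few micro-clusters to estimate a local density: start a new one.
    if len(clusters) <= min_pts:
        clusters.append(MicroCluster(point))
        return clusters[-1]

    # Winner: the micro-cluster whose centroid is closest to the new point.
    winner = min(clusters, key=lambda c: dist(c.centroid, point))

    # Local density: the radius is the distance from the winner to its
    # min_pts-th nearest neighboring micro-cluster.
    others = sorted(
        (c for c in clusters if c is not winner),
        key=lambda c: dist(c.centroid, winner.centroid),
    )
    neighbours = others[:min_pts]
    winner.radius = dist(neighbours[-1].centroid, winner.centroid)

    if dist(point, winner.centroid) <= winner.radius:
        # Absorb the point: incremental mean update of the winner's centroid.
        winner.count += 1
        winner.centroid = [
            w + (p - w) / winner.count for w, p in zip(winner.centroid, point)
        ]
        # Self-organization: pull each neighbor a fraction alpha toward the winner.
        for n in neighbours:
            n.centroid = [
                c + alpha * (w - c) for c, w in zip(n.centroid, winner.centroid)
            ]
        return winner

    # The point lies outside the winner's local radius: start a new micro-cluster.
    clusters.append(MicroCluster(point))
    return clusters[-1]
```

Because the radius is recomputed from each winner's own neighbors, dense and sparse regions get different thresholds, which is the intuition behind handling heterogeneous density without a separate offline phase.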

Details

Language :
English
ISSN :
1932-6203
Volume :
12
Issue :
7
Database :
Complementary Index
Journal :
PLoS ONE
Publication Type :
Academic Journal
Accession number :
123995128
Full Text :
https://doi.org/10.1371/journal.pone.0180543