Topic detection with recursive consensus clustering and semantic enrichment
- Source :
- Humanities & Social Sciences Communications, Vol 10, Iss 1, Pp 1-10 (2023)
- Publication Year :
- 2023
- Publisher :
- Springer Nature, 2023.
Abstract
- Extracting meaningful information from short texts such as tweets has proved to be a challenging task. The literature on topic detection focuses mostly on methods that try to guess the plausible words describing topics whose number has been fixed in advance. The resulting topics change with the initial setup of the algorithms and are consistently unstable, with words moving from one topic to another. In this paper we propose an iterative procedure for topic detection that searches for the most stable solutions in terms of the words describing a topic. The procedure combines clustering on the consensus matrix with traditional topic detection to find both a stable set of words and an optimal number of topics. We observe, however, that in several cases the procedure does not converge to a unique value but oscillates. We further enhance the methodology with semantic enrichment via word embeddings, with the aim of reducing noise and improving topic separation. We foresee the application of this set of techniques to automatic topic discovery in noisy channels such as Twitter or other social media.
- Subjects :
- History of scholarship and learning. The humanities
AZ20-999
Social Sciences
Details
- Language :
- English
- ISSN :
- 2662-9992
- Volume :
- 10
- Issue :
- 1
- Database :
- Directory of Open Access Journals
- Journal :
- Humanities & Social Sciences Communications
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.59aabb5d471b4c978da980b7da4317c4
- Document Type :
- article
- Full Text :
- https://doi.org/10.1057/s41599-023-01711-0
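
The consensus-matrix idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy documents, the function names (`word_vectors`, `kmeans_labels`, `consensus_matrix`, `stable_groups`), the use of a small k-means in place of a real topic model, and the 0.8 threshold are all assumptions made for illustration. The sketch runs an unstable base clustering many times from different random starts, records how often each pair of words lands in the same cluster, and keeps only groups of words that co-occur in a large share of runs. It shows a single pass, not the recursive procedure or the semantic enrichment step.

```python
import random

# Toy "short texts" (hypothetical data, already tokenized).
DOCS = [
    ["goal", "match", "team"],
    ["team", "goal", "league"],
    ["match", "league", "team"],
    ["vote", "election", "party"],
    ["party", "vote", "ballot"],
    ["election", "ballot", "party"],
]


def word_vectors(docs):
    """Represent each word by its presence/absence across documents."""
    vocab = sorted({w for d in docs for w in d})
    return vocab, [[1.0 if w in d else 0.0 for d in docs] for w in vocab]


def kmeans_labels(points, k, rng, iters=20):
    """Plain k-means with random initial centers; a stand-in for any
    base method whose output depends on its initial setup."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def consensus_matrix(points, k, runs=50, seed=0):
    """Entry (i, j) is the fraction of runs in which items i and j
    were assigned to the same cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(runs):
        labels = kmeans_labels(points, k, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / runs for c in row] for row in counts]


def stable_groups(consensus, threshold=0.8):
    """Connected components of the graph linking items whose
    co-assignment frequency is at least `threshold`."""
    n = len(consensus)
    seen, groups = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if j not in seen and consensus[i][j] >= threshold:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(component))
    return groups


vocab, vectors = word_vectors(DOCS)
consensus = consensus_matrix(vectors, k=2)
groups = [[vocab[i] for i in g] for g in stable_groups(consensus)]
print(len(groups), "stable group(s):", groups)
```

In this sketch the number of stable groups is read off the consensus matrix rather than fixed in advance, which is the intuition behind using consensus to estimate the number of topics; the threshold controls how strict "stable" is and would need tuning on real data.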