Evaluation of the Optimal Topic Classification for Social Media Data Combined with Text Semantics: A Case Study of Public Opinion Analysis Related to COVID-19 with Microblogs

Authors :
Qin Liang
Chunchun Hu
Si Chen
Source :
ISPRS International Journal of Geo-Information, Vol 10, Iss 12, p 811 (2021)
Publication Year :
2021
Publisher :
MDPI AG, 2021.

Abstract

Online public opinion reflects social conditions and public attitudes regarding special social events. Analyzing the temporal and spatial distributions of online public opinion topics can therefore help in understanding issues of public concern and in tracking and guiding trends in public opinion. However, evaluating the validity of a classification of online public opinion remains a challenging task in the topic mining field. By combining a Bidirectional Encoder Representations from Transformers (BERT) pre-trained model with the Latent Dirichlet Allocation (LDA) topic model, we propose an evaluation method that determines the optimal number of topics from the perspective of semantic similarity. The effectiveness of the proposed method was verified on the standard Chinese corpus THUCNews. Taking Coronavirus Disease 2019 (COVID-19)-related geotagged Weibo posts in Wuhan as an example, we used the proposed method to generate five categories of public opinion topics. Combining spatial and temporal information with the classification results, we analyzed the spatial and temporal distribution patterns of the five topics, which were found to be consistent with the development of the epidemic, demonstrating the feasibility of our method in practical cases.
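
The abstract does not spell out the scoring formula, so the following is only a minimal sketch of the general idea of picking a topic count by semantic similarity. It is not the authors' method. It assumes each candidate topic count K already has its topics represented as lists of keyword embedding vectors (in practice these might come from an LDA run plus a BERT encoder; here they are hand-made 2-D toy vectors). The score, the function names `separation_score` and `pick_topic_count`, and the data in `toy_candidates` are all hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean(values):
    """Arithmetic mean; returns 0.0 for an empty sequence."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def centroid(vectors):
    """Component-wise average of a list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def separation_score(topics):
    """Hypothetical score: keyword similarity within topics minus
    centroid similarity between topics (higher is better)."""
    intra = mean(
        mean(cosine(a, b) for a, b in combinations(topic, 2))
        for topic in topics
    )
    centroids = [centroid(topic) for topic in topics]
    inter = mean(cosine(a, b) for a, b in combinations(centroids, 2))
    return intra - inter


def pick_topic_count(candidates):
    """Return the candidate K whose topics score highest."""
    return max(candidates, key=lambda k: separation_score(candidates[k]))


# Toy data (assumption): K -> list of topics, each a list of keyword vectors.
toy_candidates = {
    1: [[[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]],
    2: [[[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0], [0.1, 0.9]]],
    3: [
        [[1.0, 0.0], [0.1, 0.9]],
        [[0.9, 0.1], [0.0, 1.0]],
        [[1.0, 0.0], [0.0, 1.0]],
    ],
}

best_k = pick_topic_count(toy_candidates)
print("selected number of topics:", best_k)
```

In this toy setup the two-topic grouping keeps similar keyword vectors together and dissimilar ones apart, so it scores highest. A real pipeline would replace the toy vectors with embeddings of each topic's top keywords and would likely need a more careful score.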

Details

Language :
English
ISSN :
22209964
Volume :
10
Issue :
12
Database :
Directory of Open Access Journals
Journal :
ISPRS International Journal of Geo-Information
Publication Type :
Academic Journal
Accession number :
edsdoj.899baff9e7ae468680f933879d45255e
Document Type :
article
Full Text :
https://doi.org/10.3390/ijgi10120811