Improving topic modeling performance on social media through semantic relationships within biomedical terminology.

Authors :
Xin, Yi
Grabowska, Monika E.
Gangireddy, Srushti
Krantz, Matthew S.
Kerchberger, V. Eric
Dickson, Alyson L.
Feng, Qiping
Yin, Zhijun
Wei, Wei-Qi
Source :
PLoS ONE; 2/21/2025, Vol. 20 Issue 2, p1-16, 16p
Publication Year :
2025

Abstract

Topic modeling utilizes unsupervised machine learning to detect underlying themes within texts and has been deployed routinely to analyze social media for insights into healthcare issues. However, the inherent messiness of social media hinders the full realization of this technique's potential. As such, we hypothesized that restricting medical concepts in social media texts to specific related semantic types and applying topic modeling to these concepts could be a feasible approach to overcome the challenge of traditional topic modeling for social media texts. Therefore, we developed a semantic-type-based topic modeling pipeline to discover self-reported health-related topics. This pipeline integrated semantic type information and Systematized Nomenclature of Medicine (SNOMED) precoordinated expressions into a traditional topic modeling approach to enhance effectiveness in clustering meaningful, distinct topics. Using social media texts regarding statins for illustration, we evaluated the efficacy of this new approach and validated a newly identified topic using real-world clinical data. Based on expert evaluations, this approach resulted in more novel, distinguishable, and meaningful health-related topics compared to traditional topic modeling. In addition, our electronic health record validation for a newly identified topic in two real-world clinical databases indicated that statin users had a higher prevalence of depression or anxiety compared to matched non-users. Our results indicate that this new topic modeling pipeline can improve the extraction of themes from noisy online discussions, thereby contributing to deeper insights for healthcare research. [ABSTRACT FROM AUTHOR]
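
The core idea in the abstract, restricting each post to medical concepts of selected semantic types before topic modeling, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the sample posts, the semantic type names, the `KEEP_TYPES` whitelist, and the helper functions `filter_concepts` and `bag_of_concepts` are all hypothetical, and the concept-extraction step that would produce the tagged input is assumed to have run already. The sketch stops at bag-of-concepts counts, which could then be passed to any standard topic model.

```python
from collections import Counter

# Hypothetical output of an upstream concept-extraction step: each post is a
# list of (concept_name, semantic_type) pairs. All values are illustrative.
posts = [
    [("muscle pain", "Sign or Symptom"),
     ("atorvastatin", "Pharmacologic Substance"),
     ("doctor", "Professional or Occupational Group")],
    [("anxiety", "Mental or Behavioral Dysfunction"),
     ("simvastatin", "Pharmacologic Substance")],
    [("pharmacy", "Organization")],
]

# Assumed whitelist of semantic types to keep (not taken from the paper).
KEEP_TYPES = {
    "Sign or Symptom",
    "Mental or Behavioral Dysfunction",
    "Pharmacologic Substance",
}


def filter_concepts(post, keep_types=KEEP_TYPES):
    """Keep only concepts whose semantic type is whitelisted.

    Multi-word concepts are joined with underscores so that each concept is
    treated as a single token by a downstream topic model.
    """
    return [
        name.replace(" ", "_")
        for name, sem_type in post
        if sem_type in keep_types
    ]


def bag_of_concepts(posts, keep_types=KEEP_TYPES):
    """Return one Counter of concept tokens per post, dropping empty posts."""
    docs = [filter_concepts(post, keep_types) for post in posts]
    return [Counter(doc) for doc in docs if doc]


bags = bag_of_concepts(posts)
```

In this sketch the third post is dropped entirely because none of its concepts has a whitelisted semantic type, which mirrors how filtering can remove off-topic noise before clustering.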

Details

Language :
English
ISSN :
19326203
Volume :
20
Issue :
2
Database :
Complementary Index
Journal :
PLoS ONE
Publication Type :
Academic Journal
Accession number :
183201869
Full Text :
https://doi.org/10.1371/journal.pone.0318702