A novel Chinese–Tibetan mixed-language rumor detector with multi-extractor representations.

Authors :
Yu, Lisu
Li, Fei
Yu, Lixin
Li, Wei
Dong, Zhicheng
Cai, Donghong
Wang, Zhen
Source :
Computer Speech & Language. Aug2024, Vol. 87, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

Rumors propagate easily through social media, posing potential threats to both individual and public health. Most existing approaches focus on single-language rumor detection and perform unsatisfactorily when applied to mixed-language rumor detection. Moreover, the type of language mixing (word-level or sentence-level) poses a major challenge for mixed-language rumor detection. Focusing on the mixed Chinese–Tibetan setting, this paper first provides a Chinese–Tibetan mixed-language rumor detection dataset (Weibo_Ch_Ti) comprising 1,617 non-rumor tweets and 1,456 rumor tweets in two mixed-language types. It then proposes an effective multi-extractor model, named "MER-CTRD" for short, which consists mainly of three extractors. The Multi-task Extractor helps the model adaptively extract feature representations of the different mixed-language types. The Rich-semantic Extractor enriches the semantic feature representations of Tibetan in the Chinese–Tibetan mixed language. The Fusion-feature Extractor fuses the mean and disparity semantic features of Chinese and Tibetan to complement the feature representations of the mixed language. Finally, experiments are conducted on Weibo_Ch_Ti. The results show that the proposed model improves accuracy by about 3%–12% over the baseline models, indicating its effectiveness in the Chinese–Tibetan mixed-language rumor detection scenario. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0885-2308
Volume :
87
Database :
Academic Search Index
Journal :
Computer Speech & Language
Publication Type :
Academic Journal
Accession number :
177037989
Full Text :
https://doi.org/10.1016/j.csl.2024.101625