A novel Chinese–Tibetan mixed-language rumor detector with multi-extractor representations.
- Source :
- Computer Speech & Language. Aug 2024, Vol. 87, pN.PAG-N.PAG. 1p.
- Publication Year :
- 2024
Abstract
- Rumors can easily propagate through social media, posing potential threats to both individual and public health. Most existing approaches focus on single-language rumor detection, which leads to unsatisfactory performance when they are applied to mixed-language rumor detection. Meanwhile, the type of language mixing (word-level or sentence-level) is a great challenge for mixed-language rumor detection. In this paper, focusing on a mixed scene of Chinese and Tibetan, the research first provides a Chinese–Tibetan mixed-language rumor detection dataset (Weibo_Ch_Ti) that comprises 1,617 non-rumor tweets and 1,456 rumor tweets in two mixed-language types. Then, the research proposes an effective model with multi-extractors, named "MER-CTRD" for short. This model mainly consists of three extractors. The Multi-task Extractor helps the model to extract feature representations of different mixed-language types adaptively. The Rich-semantic Extractor enriches the semantic feature representations of Tibetan in the Chinese–Tibetan mixed language. The Fusion-feature Extractor fuses the mean and disparity semantic features of Chinese and Tibetan to complement feature representations of the mixed language. Finally, the research conducts experiments on Weibo_Ch_Ti. The results show that the proposed model improves accuracy by about 3%–12% over the baseline models, indicating its effectiveness in the Chinese–Tibetan mixed-language rumor detection scenario. [ABSTRACT FROM AUTHOR]
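The abstract does not say how the Fusion-feature Extractor computes its "mean and disparity" features. As a rough illustration only, the sketch below assumes the mean is an element-wise average of a Chinese and a Tibetan feature vector, the disparity is their element-wise absolute difference, and the two are concatenated. The function name `fuse_mean_disparity` and the toy vectors are hypothetical and are not taken from the paper.

```python
from typing import List


def fuse_mean_disparity(chinese_vec: List[float], tibetan_vec: List[float]) -> List[float]:
    """Hypothetical fusion: concatenate element-wise mean and absolute difference.

    This is an assumed reading of "mean and disparity semantic features",
    not the authors' documented method.
    """
    if len(chinese_vec) != len(tibetan_vec):
        raise ValueError("feature vectors must have the same length")
    # Element-wise average of the two language representations.
    mean = [(c + t) / 2.0 for c, t in zip(chinese_vec, tibetan_vec)]
    # Element-wise absolute difference, used here as the "disparity" signal.
    disparity = [abs(c - t) for c, t in zip(chinese_vec, tibetan_vec)]
    # The fused vector is twice as long as each input.
    return mean + disparity


# Toy two-dimensional feature vectors, for illustration only.
fused = fuse_mean_disparity([1.0, 2.0], [3.0, 2.0])
print(fused)
```

Under these assumptions the fused vector keeps what the two representations share (the mean) alongside where they diverge (the disparity), which is one plausible way the two languages could complement each other.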
Details
- Language :
- English
- ISSN :
- 08852308
- Volume :
- 87
- Database :
- Academic Search Index
- Journal :
- Computer Speech & Language
- Publication Type :
- Academic Journal
- Accession number :
- 177037989
- Full Text :
- https://doi.org/10.1016/j.csl.2024.101625