A Semantic Enhancement Framework for Multimodal Sarcasm Detection

Authors :
Weiyu Zhong
Zhengxuan Zhang
Qiaofeng Wu
Yun Xue
Qianhua Cai
Source :
Mathematics, Vol 12, Iss 2, p 317 (2024)
Publication Year :
2024
Publisher :
MDPI AG, 2024.

Abstract

Sarcasm is a form of language in which the literal meaning diverges from the implied intention. Detecting sarcasm from text alone is difficult when the context is unclear, which motivates the use of multimodal information. However, current approaches focus only on modeling text–image incongruity at the token level and treat that incongruity as the key to detection, ignoring the significance of the overall multimodal features and of textual semantics. Moreover, semantic information from other samples with a similar manner of expression can also aid sarcasm detection. In this work, a semantic enhancement framework is proposed to address image–text congruity by modeling textual and visual information at the multi-scale and multi-span token level. Textual semantics prove particularly effective in multimodal sarcasm detection. To bridge the cross-modal semantic gap, semantic enhancement is performed with a multiple contrastive learning strategy. In experiments on a benchmark dataset, our model outperforms the latest baseline by 1.87% in terms of the F1-score and 1% in terms of accuracy.
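
The abstract does not state which contrastive objectives the framework combines. As a rough illustration only, the sketch below shows one common contrastive objective (InfoNCE) applied to paired text and image embeddings. The function names, the temperature value, and the toy vectors are assumptions made for this example and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(text_embs, image_embs, temperature=0.1):
    """Mean InfoNCE loss in the text-to-image direction.

    Assumption: text_embs[i] and image_embs[i] come from the same sample
    (the positive pair); every other image in the batch is a negative.
    """
    losses = []
    for i, t in enumerate(text_embs):
        logits = [cosine(t, v) / temperature for v in image_embs]
        # Numerically stable log-sum-exp over all candidates in the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability assigned to the matching image.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Hypothetical 2-D embeddings, chosen only to make the effect visible.
text = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each image close to its own text
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each image close to the wrong text

loss_aligned = info_nce(text, aligned)
loss_swapped = info_nce(text, swapped)
```

With these toy inputs the loss is small when each text embedding lies closest to its own image and large when the pairs are swapped, which is the behavior a contrastive objective relies on to pull matched cross-modal representations together.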

Details

Language :
English
ISSN :
2227-7390
Volume :
12
Issue :
2
Database :
Directory of Open Access Journals
Journal :
Mathematics
Publication Type :
Academic Journal
Accession number :
edsdoj.321b92dff8f74ca5a6c8354fcc5ca8df
Document Type :
article
Full Text :
https://doi.org/10.3390/math12020317