Back to Search Start Over

Fine-Grained Multimodal DeepFake Classification via Heterogeneous Graphs.

Authors :
Yin, Qilin
Lu, Wei
Cao, Xiaochun
Luo, Xiangyang
Zhou, Yicong
Huang, Jiwu
Source :
International Journal of Computer Vision. Nov2024, Vol. 132 Issue 11, p5255-5269. 15p.
Publication Year :
2024

Abstract

Nowadays, the abuse of deepfakes is a well-known issue since deepfakes can lead to severe security and privacy problems. And this situation is getting worse, as attackers are no longer limited to unimodal deepfakes, but use multimodal deepfakes, i.e., both audio forgery and video forgery, to better achieve malicious purposes. The existing unimodal or ensemble deepfake detectors are demanded with fine-grained classification capabilities for the growing technique on multimodal deepfakes. To address this gap, we propose a graph attention network based on heterogeneous graph for fine-grained multimodal deepfake classification, i.e., not only distinguishing the authenticity of samples, but also identifying the forged types, e.g., video or audio or both. To this end, we propose a positional coding-based heterogeneous graph construction method that converts an audio-visual sample into a multimodal heterogeneous graph according to relevant hyperparameters. Moreover, a cross-modal graph interaction module is devised to utilize audio-visual synchronization patterns for capturing inter-modal complementary information. The de-homogenization graph pooling operation is elaborately designed to keep differences in graph node features for enhancing the representation of graph-level features. Through the heterogeneous graph attention network, we can efficiently model intra- and inter-modal relationships of multimodal data both at spatial and temporal scales. Extensive experimental results on two audio-visual datasets FakeAVCeleb and LAV-DF demonstrate that our proposed model obtains significant performance gains as compared to other state-of-the-art competitors. The code is available at https://github.com/yinql1995/Fine-grained-Multimodal-DeepFake-Classification/. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
09205691
Volume :
132
Issue :
11
Database :
Academic Search Index
Journal :
International Journal of Computer Vision
Publication Type :
Academic Journal
Accession number :
180501499
Full Text :
https://doi.org/10.1007/s11263-024-02128-1