Author: "Jia, Xiaojun" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Jia, Xiaojun"' showing total 488 results

Start Over Author "Jia, Xiaojun"

488 results on '"Jia, Xiaojun"'

1. Global Challenge for Safe and Secure LLMs Track 1

Author: Jia, Xiaojun, Huang, Yihao, Liu, Yang, Tan, Peng Yan, Yau, Weng Kuan, Mak, Mun-Thye, Sim, Xin Ming, Ng, Wee Siong, Ng, See Kiong, Liu, Hanqing, Zhou, Lifeng, Yan, Huanqian, Sun, Xiaobing, Liu, Wei, Wang, Long, Qian, Yiming, Liu, Yong, Yang, Junxiao, Zhang, Zhexin, Lei, Leqi, Chen, Renmiao, Lu, Yida, Cui, Shiyao, Wang, Zizhou, Li, Shaohua, Wang, Yan, Goh, Rick Siow Mong, Zhen, Liangli, Zhang, Yingjie, and Zhao, Zhe
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
Abstract: This paper introduces the Global Challenge for Safe and Secure Large Language Models (LLMs), a pioneering initiative organized by AI Singapore (AISG) and the CyberSG R&D Programme Office (CRPO) to foster the development of advanced defense mechanisms against automated jailbreaking attacks. With the increasing integration of LLMs in critical sectors such as healthcare, finance, and public administration, ensuring these models are resilient to adversarial attacks is vital for preventing misuse and upholding ethical standards. This competition focused on two distinct tracks designed to evaluate and enhance the robustness of LLM security frameworks. Track 1 tasked participants with developing automated methods to probe LLM vulnerabilities by eliciting undesirable responses, effectively testing the limits of existing safety protocols within LLMs. Participants were challenged to devise techniques that could bypass content safeguards across a diverse array of scenarios, from offensive language to misinformation and illegal activities. Through this process, Track 1 aimed to deepen the understanding of LLM vulnerabilities and provide insights for creating more resilient models.
Published: 2024

2. MRJ-Agent: An Effective Jailbreak Agent for Multi-Round Dialogue

Author: Wang, Fengxiang, Duan, Ranjie, Xiao, Peng, Jia, Xiaojun, Chen, YueFeng, Wang, Chongwen, Tao, Jialing, Su, Hang, Zhu, Jun, and Xue, Hui
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Cryptography and Security
Abstract: Large Language Models (LLMs) demonstrate outstanding performance in their reservoir of knowledge and understanding capabilities, but they have also been shown to be prone to illegal or unethical reactions when subjected to jailbreak attacks. To ensure their responsible deployment in critical applications, it is crucial to understand the safety capabilities and vulnerabilities of LLMs. Previous works mainly focus on jailbreak in single-round dialogue, overlooking the potential jailbreak risks in multi-round dialogues, which are a vital way humans interact with and extract information from LLMs. Some studies have increasingly concentrated on the risks associated with jailbreak in multi-round dialogues. These efforts typically involve the use of manually crafted templates or prompt engineering techniques. However, due to the inherent complexity of multi-round dialogues, their jailbreak performance is limited. To solve this problem, we propose a novel multi-round dialogue jailbreaking agent, emphasizing the importance of stealthiness in identifying and mitigating potential threats to human values posed by LLMs. We propose a risk decomposition strategy that distributes risks across multiple rounds of queries and utilizes psychological strategies to enhance attack strength. Extensive experiments show that our proposed method surpasses other attack methods and achieves state-of-the-art attack success rate. We will make the corresponding code and dataset available for future research. The code will be released soon.
Published: 2024

3. Semantic-Aligned Adversarial Evolution Triangle for High-Transferability Vision-Language Attack

Author: Jia, Xiaojun, Gao, Sensen, Guo, Qing, Ma, Ke, Huang, Yihao, Qin, Simeng, Liu, Yang, Fellow, Ivor Tsang, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Vision-language pre-training (VLP) models excel at interpreting both images and text but remain vulnerable to multimodal adversarial examples (AEs). Advancing the generation of transferable AEs, which succeed across unseen models, is key to developing more robust and practical VLP models. Previous approaches augment image-text pairs to enhance diversity within the adversarial example generation process, aiming to improve transferability by expanding the contrast space of image-text features. However, these methods focus solely on diversity around the current AEs, yielding limited gains in transferability. To address this issue, we propose to increase the diversity of AEs by leveraging the intersection regions along the adversarial trajectory during optimization. Specifically, we propose sampling from adversarial evolution triangles composed of clean, historical, and current adversarial examples to enhance adversarial diversity. We provide a theoretical analysis to demonstrate the effectiveness of the proposed adversarial evolution triangle. Moreover, we find that redundant inactive dimensions can dominate similarity calculations, distorting feature matching and making AEs model-dependent with reduced transferability. Hence, we propose to generate AEs in the semantic image-text feature contrast space, which can project the original feature space into a semantic corpus subspace. The proposed semantic-aligned subspace can reduce the image feature redundancy, thereby improving adversarial transferability. Extensive experiments across different datasets and models demonstrate that the proposed method can effectively improve adversarial transferability and outperform state-of-the-art adversarial attack methods. The code is released at https://github.com/jiaxiaojunQAQ/SA-AET.
Published: 2024

4. CleanerCLIP: Fine-grained Counterfactual Semantic Augmentation for Backdoor Defense in Contrastive Learning

Author: Xun, Yuan, Liang, Siyuan, Jia, Xiaojun, Liu, Xinwei, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Pre-trained large models for multimodal contrastive learning, such as CLIP, have been widely recognized in the industry as highly susceptible to data-poisoned backdoor attacks. This poses significant risks to downstream model training. In response to such potential threats, finetuning offers a simpler and more efficient defense choice compared to retraining large models with augmented data. In the supervised learning domain, fine-tuning defense strategies can achieve excellent defense performance. However, in the unsupervised and semi-supervised domain, we find that when CLIP faces some complex attack techniques, the existing fine-tuning defense strategy, CleanCLIP, has some limitations on defense performance. The synonym substitution of its text-augmentation is insufficient to enhance the text feature space. To compensate for this weakness, we improve it by proposing a fine-grained \textbf{T}ext \textbf{A}lignment \textbf{C}leaner (TA-Cleaner) to cut off feature connections of backdoor triggers. We randomly select a few samples for positive and negative subtext generation at each epoch of CleanCLIP, and align the subtexts to the images to strengthen the text self-supervision. We evaluate the effectiveness of our TA-Cleaner against six attack algorithms and conduct comprehensive zero-shot classification tests on ImageNet1K. Our experimental results demonstrate that TA-Cleaner achieves state-of-the-art defensiveness among finetuning-based defense techniques. Even when faced with the novel attack technique BadCLIP, our TA-Cleaner outperforms CleanCLIP by reducing the ASR of Top-1 and Top-10 by 52.02\% and 63.88\%, respectively.
Published: 2024

5. RT-Attack: Jailbreaking Text-to-Image Models via Random Token

Author: Gao, Sensen, Jia, Xiaojun, Huang, Yihao, Duan, Ranjie, Gu, Jindong, Liu, Yang, and Guo, Qing
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Cryptography and Security
Abstract: Recently, Text-to-Image(T2I) models have achieved remarkable success in image generation and editing, yet these models still have many potential issues, particularly in generating inappropriate or Not-Safe-For-Work(NSFW) content. Strengthening attacks and uncovering such vulnerabilities can advance the development of reliable and practical T2I models. Most of the previous works treat T2I models as white-box systems, using gradient optimization to generate adversarial prompts. However, accessing the model's gradient is often impossible in real-world scenarios. Moreover, existing defense methods, those using gradient masking, are designed to prevent attackers from obtaining accurate gradient information. While some black-box jailbreak attacks have been explored, these typically rely on simply replacing sensitive words, leading to suboptimal attack performance. To address this issue, we introduce a two-stage query-based black-box attack method utilizing random search. In the first stage, we establish a preliminary prompt by maximizing the semantic similarity between the adversarial and target harmful prompts. In the second stage, we use this initial prompt to refine our approach, creating a detailed adversarial prompt aimed at jailbreaking and maximizing the similarity in image features between the images generated from this prompt and those produced by the target harmful prompt. Extensive experiments validate the effectiveness of our method in attacking the latest prompt checkers, post-hoc image checkers, securely trained T2I models, and online commercial models.
Published: 2024

6. Perception-guided Jailbreak against Text-to-Image Models

Author: Huang, Yihao, Liang, Le, Li, Tianlin, Jia, Xiaojun, Wang, Run, Miao, Weikai, Pu, Geguang, and Liu, Yang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In recent years, Text-to-Image (T2I) models have garnered significant attention due to their remarkable advancements. However, security concerns have emerged due to their potential to generate inappropriate or Not-Safe-For-Work (NSFW) images. In this paper, inspired by the observation that texts with different semantics can lead to similar human perceptions, we propose an LLM-driven perception-guided jailbreak method, termed PGJ. It is a black-box jailbreak method that requires no specific T2I model (model-free) and generates highly natural attack prompts. Specifically, we propose identifying a safe phrase that is similar in human perception yet inconsistent in text semantics with the target unsafe word and using it as a substitution. The experiments conducted on six open-source models and commercial online services with thousands of prompts have verified the effectiveness of PGJ., Comment: 8 pages
Published: 2024

7. Multimodal Unlearnable Examples: Protecting Data against Multimodal Contrastive Learning

Author: Liu, Xinwei, Jia, Xiaojun, Xun, Yuan, Liang, Siyuan, and Cao, Xiaochun
Subjects: Computer Science - Multimedia, Computer Science - Cryptography and Security
Abstract: Multimodal contrastive learning (MCL) has shown remarkable advances in zero-shot classification by learning from millions of image-caption pairs crawled from the Internet. However, this reliance poses privacy risks, as hackers may unauthorizedly exploit image-text data for model training, potentially including personal and privacy-sensitive information. Recent works propose generating unlearnable examples by adding imperceptible perturbations to training images to build shortcuts for protection. However, they are designed for unimodal classification, which remains largely unexplored in MCL. We first explore this context by evaluating the performance of existing methods on image-caption pairs, and they do not generalize effectively to multimodal data and exhibit limited impact to build shortcuts due to the lack of labels and the dispersion of pairs in MCL. In this paper, we propose Multi-step Error Minimization (MEM), a novel optimization process for generating multimodal unlearnable examples. It extends the Error-Minimization (EM) framework to optimize both image noise and an additional text trigger, thereby enlarging the optimized space and effectively misleading the model to learn the shortcut between the noise features and the text trigger. Specifically, we adopt projected gradient descent to solve the noise minimization problem and use HotFlip to approximate the gradient and replace words to find the optimal text trigger. Extensive experiments demonstrate the effectiveness of MEM, with post-protection retrieval results nearly half of random guessing, and its high transferability across different models. Our code is available on the https://github.com/thinwayliu/Multimodal-Unlearnable-Examples, Comment: ACM MM2024
Published: 2024

8. Texture Re-scalable Universal Adversarial Perturbation

Author: Huang, Yihao, Guo, Qing, Juefei-Xu, Felix, Hu, Ming, Jia, Xiaojun, Cao, Xiaochun, Pu, Geguang, and Liu, Yang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Universal adversarial perturbation (UAP), also known as image-agnostic perturbation, is a fixed perturbation map that can fool the classifier with high probabilities on arbitrary images, making it more practical for attacking deep models in the real world. Previous UAP methods generate a scale-fixed and texture-fixed perturbation map for all images, which ignores the multi-scale objects in images and usually results in a low fooling ratio. Since the widely used convolution neural networks tend to classify objects according to semantic information stored in local textures, it seems a reasonable and intuitive way to improve the UAP from the perspective of utilizing local contents effectively. In this work, we find that the fooling ratios significantly increase when we add a constraint to encourage a small-scale UAP map and repeat it vertically and horizontally to fill the whole image domain. To this end, we propose texture scale-constrained UAP (TSC-UAP), a simple yet effective UAP enhancement method that automatically generates UAPs with category-specific local textures that can fool deep models more easily. Through a low-cost operation that restricts the texture scale, TSC-UAP achieves a considerable improvement in the fooling ratio and attack transferability for both data-dependent and data-free UAP methods. Experiments conducted on two state-of-the-art UAP methods, eight popular CNN models and four classical datasets show the remarkable performance of TSC-UAP., Comment: 14 pages (accepted by TIFS2024)
Published: 2024

9. Improved Techniques for Optimization-Based Jailbreaking on Large Language Models

Author: Jia, Xiaojun, Pang, Tianyu, Du, Chao, Huang, Yihao, Gu, Jindong, Liu, Yang, Cao, Xiaochun, and Lin, Min
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language, Computer Science - Cryptography and Security
Abstract: Large language models (LLMs) are being rapidly developed, and a key component of their widespread deployment is their safety-related alignment. Many red-teaming efforts aim to jailbreak LLMs, where among these efforts, the Greedy Coordinate Gradient (GCG) attack's success has led to a growing interest in the study of optimization-based jailbreaking techniques. Although GCG is a significant milestone, its attacking efficiency remains unsatisfactory. In this paper, we present several improved (empirical) techniques for optimization-based jailbreaks like GCG. We first observe that the single target template of "Sure" largely limits the attacking performance of GCG; given this, we propose to apply diverse target templates containing harmful self-suggestion and/or guidance to mislead LLMs. Besides, from the optimization aspects, we propose an automatic multi-coordinate updating strategy in GCG (i.e., adaptively deciding how many tokens to replace in each step) to accelerate convergence, as well as tricks like easy-to-hard initialisation. Then, we combine these improved technologies to develop an efficient jailbreak method, dubbed I-GCG. In our experiments, we evaluate on a series of benchmarks (such as NeurIPS 2023 Red Teaming Track). The results demonstrate that our improved techniques can help GCG outperform state-of-the-art jailbreaking attacks and achieve nearly 100% attack success rate. The code is released at https://github.com/jiaxiaojunQAQ/I-GCG.
Published: 2024

10. Text Modality Oriented Image Feature Extraction for Detecting Diffusion-based DeepFake

Author: Yang, Di, Huang, Yihao, Guo, Qing, Juefei-Xu, Felix, Jia, Xiaojun, Wang, Run, Pu, Geguang, and Liu, Yang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The widespread use of diffusion methods enables the creation of highly realistic images on demand, thereby posing significant risks to the integrity and safety of online information and highlighting the necessity of DeepFake detection. Our analysis of features extracted by traditional image encoders reveals that both low-level and high-level features offer distinct advantages in identifying DeepFake images produced by various diffusion methods. Inspired by this finding, we aim to develop an effective representation that captures both low-level and high-level features to detect diffusion-based DeepFakes. To address the problem, we propose a text modality-oriented feature extraction method, termed TOFE. Specifically, for a given target image, the representation we discovered is a corresponding text embedding that can guide the generation of the target image with a specific text-to-image model. Experiments conducted across ten diffusion types demonstrate the efficacy of our proposed method.
Published: 2024

11. Semantic-guided Prompt Organization for Universal Goal Hijacking against LLMs

Author: Huang, Yihao, Wang, Chong, Jia, Xiaojun, Guo, Qing, Juefei-Xu, Felix, Zhang, Jian, Pu, Geguang, and Liu, Yang
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: With the rising popularity of Large Language Models (LLMs), assessing their trustworthiness through security tasks has gained critical importance. Regarding the new task of universal goal hijacking, previous efforts have concentrated solely on optimization algorithms, overlooking the crucial role of the prompt. To fill this gap, we propose a universal goal hijacking method called POUGH that incorporates semantic-guided prompt processing strategies. Specifically, the method starts with a sampling strategy to select representative prompts from a candidate pool, followed by a ranking strategy that prioritizes the prompts. Once the prompts are organized sequentially, the method employs an iterative optimization algorithm to generate the universal fixed suffix for the prompts. Experiments conducted on four popular LLMs and ten types of target responses verified the effectiveness of our method., Comment: 15 pages
Published: 2024

12. Identity Inference from CLIP Models using Only Textual Data

Author: Li, Songze, Cheng, Ruoxi, and Jia, Xiaojun
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: The widespread usage of large-scale multimodal models like CLIP has heightened concerns about the leakage of personally identifiable information (PII). Existing methods for identity inference in CLIP models, i.e., to detect the presence of a person's PII used for training a CLIP model, require querying the model with full PII, including textual descriptions of the person and corresponding images (e.g., the name and the face photo of the person). However, this may lead to potential privacy breach of the image, as it may have not been seen by the target model yet. Additionally, traditional membership inference attacks (MIAs) train shadow models to mimic the behaviors of the target model, which incurs high computational costs, especially for large CLIP models. To address these challenges, we propose a textual unimodal detector (TUNI) in CLIP models, a novel method for ID inference that 1) queries the target model with only text data; and 2) does not require training shadow models. Firstly, we develop a feature extraction algorithm, guided by the CLIP model, to extract features from a text description. TUNI starts with randomly generating textual gibberish that were clearly not utilized for training, and leverages their feature vectors to train a system of anomaly detectors. During inference, the feature vector of each test text is fed into the anomaly detectors to determine if the person's PII is in the training set (abnormal) or not (normal). Moreover, TUNI can be further strengthened integrating real images associated with the tested individuals, if available at the detector. Extensive experiments of TUNI across various CLIP model architectures and datasets demonstrate its superior performance over baselines, albeit with only text data.
Published: 2024

13. Semi-device-independent quantum random number generator with a broadband squeezed state of light

Author: Cheng, Jialin, Liang, Shaocong, Qin, Jiliang, Li, Jiatong, Yan, Zhihui, Jia, Xiaojun, Xie, Changde, and Peng, Kunchi
Subjects: Quantum Physics
Abstract: Random numbers are a basic ingredient of simulation algorithms and cryptography, and play a significant part in computer simulation and information processing. One prominent feature of a squeezed light is its lower fluctuation and more randomness in a pair
Published: 2024
Full Text: View/download PDF

14. Efficient Generation of Targeted and Transferable Adversarial Examples for Vision-Language Models Via Diffusion Models

Author: Guo, Qi, Pang, Shanmin, Jia, Xiaojun, Liu, Yang, and Guo, Qing
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Adversarial attacks, particularly \textbf{targeted} transfer-based attacks, can be used to assess the adversarial robustness of large visual-language models (VLMs), allowing for a more thorough examination of potential security flaws before deployment. However, previous transfer-based adversarial attacks incur high costs due to high iteration counts and complex method structure. Furthermore, due to the unnaturalness of adversarial semantics, the generated adversarial examples have low transferability. These issues limit the utility of existing methods for assessing robustness. To address these issues, we propose AdvDiffVLM, which uses diffusion models to generate natural, unrestricted and targeted adversarial examples via score matching. Specifically, AdvDiffVLM uses Adaptive Ensemble Gradient Estimation to modify the score during the diffusion model's reverse generation process, ensuring that the produced adversarial examples have natural adversarial targeted semantics, which improves their transferability. Simultaneously, to improve the quality of adversarial examples, we use the GradCAM-guided Mask method to disperse adversarial semantics throughout the image rather than concentrating them in a single area. Finally, AdvDiffVLM embeds more target semantics into adversarial examples after multiple iterations. Experimental results show that our method generates adversarial examples 5x to 10x faster than state-of-the-art transfer-based adversarial attacks while maintaining higher quality adversarial examples. Furthermore, compared to previous transfer-based adversarial attacks, the adversarial examples generated by our method have better transferability. Notably, AdvDiffVLM can successfully attack a variety of commercial VLMs in a black-box environment, including GPT-4V.
Published: 2024

15. High-speed quantum radio-frequency-over-light communication

Author: Liang, Shaocong, Cheng, Jialin, Qin, Jiliang, Li, Jiatong, Shi, Yi, Yan, Zhihui, Jia, Xiaojun, Xie, Changde, and Peng, Kunchi
Subjects: Quantum Physics
Abstract: Quantum dense coding (QDC) means to transmit two classical bits by only transferring one quantum bit, which has enabled high-capacity information transmission and strengthened system security. Continuousvariable QDC offers a promising solution to increase communication rates while achieving seamless integration with classical communication systems. Here, we propose and experimentally demonstrate a high-speed quantum radio-frequency-over-light (RFoL) communication scheme based on QDC with entangled state, and achieve a practical rate of 20 Mbps through digital modulation and RFoL communication. This scheme bridges the gap between quantum technology and real-world communication systems, which bring QDC closer to practical applications and offer prospects for further enhancement of metropolitan communication networks.
Published: 2024
Full Text: View/download PDF

16. Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory

Author: Gao, Sensen, Jia, Xiaojun, Ren, Xuhong, Tsang, Ivor, and Guo, Qing
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Vision-language pre-training (VLP) models exhibit remarkable capabilities in comprehending both images and text, yet they remain susceptible to multimodal adversarial examples (AEs). Strengthening attacks and uncovering vulnerabilities, especially common issues in VLP models (e.g., high transferable AEs), can advance reliable and practical VLP models. A recent work (i.e., Set-level guidance attack) indicates that augmenting image-text pairs to increase AE diversity along the optimization path enhances the transferability of adversarial examples significantly. However, this approach predominantly emphasizes diversity around the online adversarial examples (i.e., AEs in the optimization period), leading to the risk of overfitting the victim model and affecting the transferability. In this study, we posit that the diversity of adversarial examples towards the clean input and online AEs are both pivotal for enhancing transferability across VLP models. Consequently, we propose using diversification along the intersection region of adversarial trajectory to expand the diversity of AEs. To fully leverage the interaction between modalities, we introduce text-guided adversarial example selection during optimization. Furthermore, to further mitigate the potential overfitting, we direct the adversarial text deviating from the last intersection region along the optimization path, rather than adversarial images as in existing methods. Extensive experiments affirm the effectiveness of our method in improving transferability across various VLP models and downstream vision-and-language tasks., Comment: ECCV2024. Code is available at https://github.com/SensenGao/VLPTransferAttack
Published: 2024

17. Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds

Author: Lou, Tianrui, Jia, Xiaojun, Gu, Jindong, Liu, Li, Liang, Siyuan, He, Bangyan, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Adversarial attack methods based on point manipulation for 3D point cloud classification have revealed the fragility of 3D models, yet the adversarial examples they produce are easily perceived or defended against. The trade-off between the imperceptibility and adversarial strength leads most point attack methods to inevitably introduce easily detectable outlier points upon a successful attack. Another promising strategy, shape-based attack, can effectively eliminate outliers, but existing methods often suffer significant reductions in imperceptibility due to irrational deformations. We find that concealing deformation perturbations in areas insensitive to human eyes can achieve a better trade-off between imperceptibility and adversarial strength, specifically in parts of the object surface that are complex and exhibit drastic curvature changes. Therefore, we propose a novel shape-based adversarial attack method, HiT-ADV, which initially conducts a two-stage search for attack regions based on saliency and imperceptibility scores, and then adds deformation perturbations in each attack region using Gaussian kernel functions. Additionally, HiT-ADV is extendable to physical attack. We propose that by employing benign resampling and benign rigid transformations, we can further enhance physical adversarial strength with little sacrifice to imperceptibility. Extensive experiments have validated the superiority of our method in terms of adversarial and imperceptible properties in both digital and physical spaces. Our code is avaliable at: https://github.com/TRLou/HiT-ADV., Comment: Accepted by CVPR 2024
Published: 2024

18. Poisoned Forgery Face: Towards Backdoor Attacks on Face Forgery Detection

Author: Liang, Jiawei, Liang, Siyuan, Liu, Aishan, Jia, Xiaojun, Kuang, Junhao, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The proliferation of face forgery techniques has raised significant concerns within society, thereby motivating the development of face forgery detection methods. These methods aim to distinguish forged faces from genuine ones and have proven effective in practical applications. However, this paper introduces a novel and previously unrecognized threat in face forgery detection scenarios caused by backdoor attack. By embedding backdoors into models and incorporating specific trigger patterns into the input, attackers can deceive detectors into producing erroneous predictions for forged faces. To achieve this goal, this paper proposes \emph{Poisoned Forgery Face} framework, which enables clean-label backdoor attacks on face forgery detectors. Our approach involves constructing a scalable trigger generator and utilizing a novel convolving process to generate translation-sensitive trigger patterns. Moreover, we employ a relative embedding method based on landmark-based regions to enhance the stealthiness of the poisoned samples. Consequently, detectors trained on our poisoned samples are embedded with backdoors. Notably, our approach surpasses SoTA backdoor baselines with a significant improvement in attack success rate (+16.39\% BD-AUC) and reduction in visibility (-12.65\% $L_\infty$). Furthermore, our attack exhibits promising performance against backdoor defenses. We anticipate that this paper will draw greater attention to the potential threats posed by backdoor attacks in face forgery detection scenarios. Our codes will be made available at \url{https://github.com/JWLiang007/PFF}, Comment: ICLR 2024 Spotlight
Published: 2024

19. Improving Robustness of LiDAR-Camera Fusion Model against Weather Corruption from Fusion Strategy Perspective

Author: Huang, Yihao, Yu, Kaiyuan, Guo, Qing, Juefei-Xu, Felix, Jia, Xiaojun, Li, Tianlin, Pu, Geguang, and Liu, Yang
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: In recent years, LiDAR-camera fusion models have markedly advanced 3D object detection tasks in autonomous driving. However, their robustness against common weather corruption such as fog, rain, snow, and sunlight in the intricate physical world remains underexplored. In this paper, we evaluate the robustness of fusion models from the perspective of fusion strategies on the corrupted dataset. Based on the evaluation, we further propose a concise yet practical fusion strategy to enhance the robustness of the fusion models, namely flexibly weighted fusing features from LiDAR and camera sources to adapt to varying weather scenarios. Experiments conducted on four types of fusion models, each with two distinct lightweight implementations, confirm the broad applicability and effectiveness of the approach., Comment: 17 pages
Published: 2024

20. Cheating Suffix: Targeted Attack to Text-To-Image Diffusion Models with Multi-Modal Priors

Author: Yang, Dingcheng, Bai, Yang, Jia, Xiaojun, Liu, Yang, Cao, Xiaochun, and Yu, Wenjian
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security, Computer Science - Computer Vision and Pattern Recognition
Abstract: Diffusion models have been widely deployed in various image generation tasks, demonstrating an extraordinary connection between image and text modalities. However, they face challenges of being maliciously exploited to generate harmful or sensitive images by appending a specific suffix to the original prompt. Existing works mainly focus on using single-modal information to conduct attacks, which fails to utilize multi-modal features and results in less than satisfactory performance. Integrating multi-modal priors (MMP), i.e. both text and image features, we propose a targeted attack method named MMP-Attack in this work. Specifically, the goal of MMP-Attack is to add a target object into the image content while simultaneously removing the original object. The MMP-Attack shows a notable advantage over existing works with superior universality and transferability, which can effectively attack commercial text-to-image (T2I) models such as DALL-E 3. To the best of our knowledge, this marks the first successful attempt of transfer-based attack to commercial T2I models. Our code is publicly available at \url{https://github.com/ydc123/MMP-Attack}., Comment: 10 figures
Published: 2024

21. Does Few-shot Learning Suffer from Backdoor Attacks?

Author: Liu, Xinwei, Jia, Xiaojun, Gu, Jindong, Xun, Yuan, Liang, Siyuan, and Cao, Xiaochun
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence
Abstract: The field of few-shot learning (FSL) has shown promising results in scenarios where training data is limited, but its vulnerability to backdoor attacks remains largely unexplored. We first explore this topic by first evaluating the performance of the existing backdoor attack methods on few-shot learning scenarios. Unlike in standard supervised learning, existing backdoor attack methods failed to perform an effective attack in FSL due to two main issues. Firstly, the model tends to overfit to either benign features or trigger features, causing a tough trade-off between attack success rate and benign accuracy. Secondly, due to the small number of training samples, the dirty label or visible trigger in the support set can be easily detected by victims, which reduces the stealthiness of attacks. It seemed that FSL could survive from backdoor attacks. However, in this paper, we propose the Few-shot Learning Backdoor Attack (FLBA) to show that FSL can still be vulnerable to backdoor attacks. Specifically, we first generate a trigger to maximize the gap between poisoned and benign features. It enables the model to learn both benign and trigger features, which solves the problem of overfitting. To make it more stealthy, we hide the trigger by optimizing two types of imperceptible perturbation, namely attractive and repulsive perturbation, instead of attaching the trigger directly. Once we obtain the perturbations, we can poison all samples in the benign support set into a hidden poisoned support set and fine-tune the model on it. Our method demonstrates a high Attack Success Rate (ASR) in FSL tasks with different few-shot learning paradigms while preserving clean accuracy and maintaining stealthiness. This study reveals that few-shot learning still suffers from backdoor attacks, and its security should be given attention., Comment: AAAI2024
Published: 2023

22. JailGuard: A Universal Detection Framework for LLM Prompt-based Attacks

Author: Zhang, Xiaoyu, Zhang, Cen, Li, Tianlin, Huang, Yihao, Jia, Xiaojun, Hu, Ming, Zhang, Jie, Liu, Yang, Ma, Shiqing, and Shen, Chao
Subjects: Computer Science - Cryptography and Security
Abstract: Large Language Models (LLMs) and Multi-Modal LLMs (MLLMs) have played a critical role in numerous applications. However, current LLMs are vulnerable to prompt-based attacks, with jailbreaking attacks enabling LLMs to generate harmful content, while hijacking attacks manipulate the model to perform unintended tasks, underscoring the necessity for detection methods. Unfortunately, existing detecting approaches are usually tailored to specific attacks, resulting in poor generalization in detecting various attacks across different modalities. To address it, we propose JailGuard, a universal detection framework for jailbreaking and hijacking attacks across LLMs and MLLMs. JailGuard operates on the principle that attacks are inherently less robust than benign ones, regardless of method or modality. Specifically, JailGuard mutates untrusted inputs to generate variants and leverages the discrepancy of the variants' responses on the model to distinguish attack samples from benign samples. We implement 18 mutators for text and image inputs and design a mutator combination policy to further improve detection generalization. To evaluate the effectiveness of JailGuard, we build the first comprehensive multi-modal attack dataset, containing 11,000 data items across 15 known attack types. The evaluation suggests that JailGuard achieves the best detection accuracy of 86.14%/82.90% on text and image inputs, outperforming state-of-the-art methods by 11.81%-25.73% and 12.20%-21.40%., Comment: 28 pages, 9 figures
Published: 2023

23. SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Augmentation

Author: He, Bangyan, Jia, Xiaojun, Liang, Siyuan, Lou, Tianrui, Liu, Yang, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: Current Visual-Language Pre-training (VLP) models are vulnerable to adversarial examples. These adversarial examples present substantial security risks to VLP models, as they can leverage inherent weaknesses in the models, resulting in incorrect predictions. In contrast to white-box adversarial attacks, transfer attacks (where the adversary crafts adversarial examples on a white-box model to fool another black-box model) are more reflective of real-world scenarios, thus making them more meaningful for research. By summarizing and analyzing existing research, we identified two factors that can influence the efficacy of transfer attacks on VLP models: inter-modal interaction and data diversity. Based on these insights, we propose a self-augment-based transfer attack method, termed SA-Attack. Specifically, during the generation of adversarial images and adversarial texts, we apply different data augmentation methods to the image modality and text modality, respectively, with the aim of improving the adversarial transferability of the generated adversarial images and texts. Experiments conducted on the FLickr30K and COCO datasets have validated the effectiveness of our method. Our code will be available after this paper is accepted.
Published: 2023

24. OT-Attack: Enhancing Adversarial Transferability of Vision-Language Models via Optimal Transport Optimization

Author: Han, Dongchen, Jia, Xiaojun, Bai, Yang, Gu, Jindong, Liu, Yang, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Vision-language pre-training (VLP) models demonstrate impressive abilities in processing both images and text. However, they are vulnerable to multi-modal adversarial examples (AEs). Investigating the generation of high-transferability adversarial examples is crucial for uncovering VLP models' vulnerabilities in practical scenarios. Recent works have indicated that leveraging data augmentation and image-text modal interactions can enhance the transferability of adversarial examples for VLP models significantly. However, they do not consider the optimal alignment problem between dataaugmented image-text pairs. This oversight leads to adversarial examples that are overly tailored to the source model, thus limiting improvements in transferability. In our research, we first explore the interplay between image sets produced through data augmentation and their corresponding text sets. We find that augmented image samples can align optimally with certain texts while exhibiting less relevance to others. Motivated by this, we propose an Optimal Transport-based Adversarial Attack, dubbed OT-Attack. The proposed method formulates the features of image and text sets as two distinct distributions and employs optimal transport theory to determine the most efficient mapping between them. This optimal mapping informs our generation of adversarial examples to effectively counteract the overfitting issues. Extensive experiments across various network architectures and datasets in image-text matching tasks reveal that our OT-Attack outperforms existing state-of-the-art methods in terms of adversarial transferability.
Published: 2023

25. TranSegPGD: Improving Transferability of Adversarial Examples on Semantic Segmentation

Author: Jia, Xiaojun, Gu, Jindong, Huang, Yihao, Qin, Simeng, Guo, Qing, Liu, Yang, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Transferability of adversarial examples on image classification has been systematically explored, which generates adversarial examples in black-box mode. However, the transferability of adversarial examples on semantic segmentation has been largely overlooked. In this paper, we propose an effective two-stage adversarial attack strategy to improve the transferability of adversarial examples on semantic segmentation, dubbed TranSegPGD. Specifically, at the first stage, every pixel in an input image is divided into different branches based on its adversarial property. Different branches are assigned different weights for optimization to improve the adversarial performance of all pixels.We assign high weights to the loss of the hard-to-attack pixels to misclassify all pixels. At the second stage, the pixels are divided into different branches based on their transferable property which is dependent on Kullback-Leibler divergence. Different branches are assigned different weights for optimization to improve the transferability of the adversarial examples. We assign high weights to the loss of the high-transferability pixels to improve the transferability of adversarial examples. Extensive experiments with various segmentation models are conducted on PASCAL VOC 2012 and Cityscapes datasets to demonstrate the effectiveness of the proposed method. The proposed adversarial attack method can achieve state-of-the-art performance.
Published: 2023

26. Boosting Transferability in Vision-Language Attacks via Diversification Along the Intersection Region of Adversarial Trajectory

Author: Gao, Sensen, Jia, Xiaojun, Ren, Xuhong, Tsang, Ivor, Guo, Qing, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
Published: 2025
Full Text: View/download PDF

27. A Survey on Transferability of Adversarial Examples across Deep Neural Networks

Author: Gu, Jindong, Jia, Xiaojun, de Jorge, Pau, Yu, Wenqain, Liu, Xinwei, Ma, Avery, Xun, Yuan, Hu, Anjun, Khakzar, Ashkan, Li, Zhijiang, Cao, Xiaochun, and Torr, Philip
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The emergence of Deep Neural Networks (DNNs) has revolutionized various domains by enabling the resolution of complex tasks spanning image recognition, natural language processing, and scientific problem-solving. However, this progress has also brought to light a concerning vulnerability: adversarial examples. These crafted inputs, imperceptible to humans, can manipulate machine learning models into making erroneous predictions, raising concerns for safety-critical applications. An intriguing property of this phenomenon is the transferability of adversarial examples, where perturbations crafted for one model can deceive another, often with a different architecture. This intriguing property enables black-box attacks which circumvents the need for detailed knowledge of the target model. This survey explores the landscape of the adversarial transferability of adversarial examples. We categorize existing methodologies to enhance adversarial transferability and discuss the fundamental principles guiding each approach. While the predominant body of research primarily concentrates on image classification, we also extend our discussion to encompass other vision tasks and beyond. Challenges and opportunities are discussed, highlighting the importance of fortifying DNNs against adversarial vulnerabilities in an evolving landscape., Comment: Accepted to Transactions on Machine Learning Research (TMLR)
Published: 2023

28. Fast Propagation is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks

Author: Jia, Xiaojun, Li, Jianshu, Gu, Jindong, Bai, Yang, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Adversarial training has shown promise in building robust models against adversarial examples. A major drawback of adversarial training is the computational overhead introduced by the generation of adversarial examples. To overcome this limitation, adversarial training based on single-step attacks has been explored. Previous work improves the single-step adversarial training from different perspectives, e.g., sample initialization, loss regularization, and training strategy. Almost all of them treat the underlying model as a black box. In this work, we propose to exploit the interior building blocks of the model to improve efficiency. Specifically, we propose to dynamically sample lightweight subnetworks as a surrogate model during training. By doing this, both the forward and backward passes can be accelerated for efficient adversarial training. Besides, we provide theoretical analysis to show the model robustness can be improved by the single-step adversarial training with sampled subnetworks. Furthermore, we propose a novel sampling strategy where the sampling varies from layer to layer and from iteration to iteration. Compared with previous methods, our method not only reduces the training cost but also achieves better model robustness. Evaluations on a series of popular datasets demonstrate the effectiveness of the proposed FB-Better. Our code has been released at https://github.com/jiaxiaojunQAQ/FP-Better.
Published: 2023

29. Revisiting and Exploring Efficient Fast Adversarial Training via LAW: Lipschitz Regularization and Auto Weight Averaging

Author: Jia, Xiaojun, Chen, Yuefeng, Mao, Xiaofeng, Duan, Ranjie, Gu, Jindong, Zhang, Rong, Xue, Hui, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Fast Adversarial Training (FAT) not only improves the model robustness but also reduces the training cost of standard adversarial training. However, fast adversarial training often suffers from Catastrophic Overfitting (CO), which results in poor robustness performance. Catastrophic Overfitting describes the phenomenon of a sudden and significant decrease in robust accuracy during the training of fast adversarial training. Many effective techniques have been developed to prevent Catastrophic Overfitting and improve the model robustness from different perspectives. However, these techniques adopt inconsistent training settings and require different training costs, i.e, training time and memory costs, leading to unfair comparisons. In this paper, we conduct a comprehensive study of over 10 fast adversarial training methods in terms of adversarial robustness and training costs. We revisit the effectiveness and efficiency of fast adversarial training techniques in preventing Catastrophic Overfitting from the perspective of model local nonlinearity and propose an effective Lipschitz regularization method for fast adversarial training. Furthermore, we explore the effect of data augmentation and weight averaging in fast adversarial training and propose a simple yet effective auto weight averaging method to improve robustness further. By assembling these techniques, we propose a FGSM-based fast adversarial training method equipped with Lipschitz regularization and Auto Weight averaging, abbreviated as FGSM-LAW. Experimental evaluations on four benchmark databases demonstrate the superiority of the proposed method over state-of-the-art fast adversarial training methods and the advanced standard adversarial training methods.
Published: 2023

30. Robust Automatic Speech Recognition via WavAugment Guided Phoneme Adversarial Training

Author: Qi, Gege, Chen, Yuefeng, Mao, Xiaofeng, Jia, Xiaojun, Duan, Ranjie, Zhang, Rong, and Xue, Hui
Subjects: Computer Science - Sound, Computer Science - Computation and Language, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Developing a practically-robust automatic speech recognition (ASR) is challenging since the model should not only maintain the original performance on clean samples, but also achieve consistent efficacy under small volume perturbations and large domain shifts. To address this problem, we propose a novel WavAugment Guided Phoneme Adversarial Training (wapat). wapat use adversarial examples in phoneme space as augmentation to make the model invariant to minor fluctuations in phoneme representation and preserve the performance on clean samples. In addition, wapat utilizes the phoneme representation of augmented samples to guide the generation of adversaries, which helps to find more stable and diverse gradient-directions, resulting in improved generalization. Extensive experiments demonstrate the effectiveness of wapat on End-to-end Speech Challenge Benchmark (ESB). Notably, SpeechLM-wapat outperforms the original model by 6.28% WER reduction on ESB, achieving the new state-of-the-art.
Published: 2023

31. Context-Aware Robust Fine-Tuning

Author: Mao, Xiaofeng, Chen, Yufeng, Jia, Xiaojun, Zhang, Rong, Xue, Hui, and Li, Zhao
Published: 2024
Full Text: View/download PDF

32. Improving Fast Adversarial Training with Prior-Guided Knowledge

Author: Jia, Xiaojun, Zhang, Yong, Wei, Xingxing, Wu, Baoyuan, Ma, Ke, Wang, Jue, and Cao, Xiaochun
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: Fast adversarial training (FAT) is an efficient method to improve robustness. However, the original FAT suffers from catastrophic overfitting, which dramatically and suddenly reduces robustness after a few training epochs. Although various FAT variants have been proposed to prevent overfitting, they require high training costs. In this paper, we investigate the relationship between adversarial example quality and catastrophic overfitting by comparing the training processes of standard adversarial training and FAT. We find that catastrophic overfitting occurs when the attack success rate of adversarial examples becomes worse. Based on this observation, we propose a positive prior-guided adversarial initialization to prevent overfitting by improving adversarial example quality without extra training costs. This initialization is generated by using high-quality adversarial perturbations from the historical training process. We provide theoretical analysis for the proposed initialization and propose a prior-guided regularization method that boosts the smoothness of the loss function. Additionally, we design a prior-guided ensemble FAT method that averages the different model weights of historical models using different decay rates. Our proposed method, called FGSM-PGK, assembles the prior-guided knowledge, i.e., the prior-guided initialization and model weights, acquired during the historical training process. Evaluations of four datasets demonstrate the superiority of the proposed method.
Published: 2023

33. Research on the Influence of Financial Development on Industrial Structure Upgrading

Author: Li, Nan, Jia, Xiaojun, Barbosa-Povoa, Ana Paula, Editorial Board Member, de Almeida, Adiel Teixeira, Editorial Board Member, Gans, Noah, Editorial Board Member, Gupta, Jatinder N. D., Editorial Board Member, Heim, Gregory R., Editorial Board Member, Hua, Guowei, Editorial Board Member, Kimms, Alf, Editorial Board Member, Li, Xiang, Editorial Board Member, Masri, Hatem, Editorial Board Member, Nickel, Stefan, Editorial Board Member, Qiu, Robin, Editorial Board Member, Shankar, Ravi, Editorial Board Member, Slowiński, Roman, Editorial Board Member, Tang, Christopher S., Editorial Board Member, Wu, Yuzhe, Editorial Board Member, Zhu, Joe, Editorial Board Member, Zopounidis, Constantin, Editorial Board Member, Li, Menggang, editor, Guowei, Hua, editor, Huang, Anqiang, editor, Fu, Xiaowen, editor, and Chang, Dan, editor
Published: 2024
Full Text: View/download PDF

34. Context-Aware Robust Fine-Tuning

Author: Mao, Xiaofeng, Chen, Yuefeng, Jia, Xiaojun, Zhang, Rong, Xue, Hui, and Li, Zhao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Contrastive Language-Image Pre-trained (CLIP) models have zero-shot ability of classifying an image belonging to "[CLASS]" by using similarity between the image and the prompt sentence "a [CONTEXT] of [CLASS]". Based on exhaustive text cues in "[CONTEXT]", CLIP model is aware of different contexts, e.g. background, style, viewpoint, and exhibits unprecedented robustness against a wide range of distribution shifts. However, recent works find further fine-tuning of CLIP models improves accuracy but sacrifices the robustness on downstream tasks. We conduct an empirical investigation to show fine-tuning will corrupt the context-aware ability of pre-trained CLIP features. To solve this problem, we propose Context-Aware Robust Fine-tuning (CAR-FT). CAR-FT regularizes the model during fine-tuning to capture the context information. Specifically, we use zero-shot prompt weights to get the context distribution contained in the image. By minimizing the Kullback-Leibler Divergence (KLD) between context distributions induced by original/fine-tuned CLIP models, CAR-FT makes the context-aware ability of CLIP inherited into downstream tasks, and achieves both higher In-Distribution (ID) and Out-Of-Distribution (OOD) accuracy. The experimental results show CAR-FT achieves superior robustness on five OOD test datasets of ImageNet, and meanwhile brings accuracy gains on nine downstream tasks. Additionally, CAR-FT surpasses previous Domain Generalization (DG) methods and gets 78.5% averaged accuracy on DomainBed benchmark, building the new state-of-the-art.
Published: 2022

35. A Large-scale Multiple-objective Method for Black-box Attack against Object Detection

Author: Liang, Siyuan, Li, Longkang, Fan, Yanbo, Jia, Xiaojun, Li, Jingzhi, Wu, Baoyuan, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Recent studies have shown that detectors based on deep models are vulnerable to adversarial examples, even in the black-box scenario where the attacker cannot access the model information. Most existing attack methods aim to minimize the true positive rate, which often shows poor attack performance, as another sub-optimal bounding box may be detected around the attacked bounding box to be the new true positive one. To settle this challenge, we propose to minimize the true positive rate and maximize the false positive rate, which can encourage more false positive objects to block the generation of new true positive bounding boxes. It is modeled as a multi-objective optimization (MOP) problem, of which the generic algorithm can search the Pareto-optimal. However, our task has more than two million decision variables, leading to low searching efficiency. Thus, we extend the standard Genetic Algorithm with Random Subset selection and Divide-and-Conquer, called GARSDC, which significantly improves the efficiency. Moreover, to alleviate the sensitivity to population quality in generic algorithms, we generate a gradient-prior initial population, utilizing the transferability between different detectors with similar backbones. Compared with the state-of-art attack methods, GARSDC decreases by an average 12.0 in the mAP and queries by about 1000 times in extensive experiments. Our codes can be found at https://github.com/LiangSiyuan21/ GARSDC., Comment: 14 pages, 5 figures, ECCV2022
Published: 2022

36. MOVE: Effective and Harmless Ownership Verification via Embedded External Features

Author: Li, Yiming, Zhu, Linghui, Jia, Xiaojun, Bai, Yang, Jiang, Yong, Xia, Shu-Tao, and Cao, Xiaochun
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Currently, deep neural networks (DNNs) are widely adopted in different applications. Despite its commercial values, training a well-performed DNN is resource-consuming. Accordingly, the well-trained model is valuable intellectual property for its owner. However, recent studies revealed the threats of model stealing, where the adversaries can obtain a function-similar copy of the victim model, even when they can only query the model. In this paper, we propose an effective and harmless model ownership verification (MOVE) to defend against different types of model stealing simultaneously, without introducing new security risks. In general, we conduct the ownership verification by verifying whether a suspicious model contains the knowledge of defender-specified external features. Specifically, we embed the external features by tempering a few training samples with style transfer. We then train a meta-classifier to determine whether a model is stolen from the victim. This approach is inspired by the understanding that the stolen models should contain the knowledge of features learned by the victim model. In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection. Extensive experiments on benchmark datasets verify the effectiveness of our method and its resistance to potential adaptive attacks. The codes for reproducing the main experiments of our method are available at \url{https://github.com/THUYimingLi/MOVE}., Comment: 15 pages. The journal extension of our conference paper in AAAI 2022 (https://ojs.aaai.org/index.php/AAAI/article/view/20036). arXiv admin note: substantial text overlap with arXiv:2112.03476
Published: 2022

37. Prior-Guided Adversarial Initialization for Fast Adversarial Training

Author: Jia, Xiaojun, Zhang, Yong, Wei, Xingxing, Wu, Baoyuan, Ma, Ke, Wang, Jue, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Fast adversarial training (FAT) effectively improves the efficiency of standard adversarial training (SAT). However, initial FAT encounters catastrophic overfitting, i.e.,the robust accuracy against adversarial attacks suddenly and dramatically decreases. Though several FAT variants spare no effort to prevent overfitting, they sacrifice much calculation cost. In this paper, we explore the difference between the training processes of SAT and FAT and observe that the attack success rate of adversarial examples (AEs) of FAT gets worse gradually in the late training stage, resulting in overfitting. The AEs are generated by the fast gradient sign method (FGSM) with a zero or random initialization. Based on the observation, we propose a prior-guided FGSM initialization method to avoid overfitting after investigating several initialization strategies, improving the quality of the AEs during the whole training process. The initialization is formed by leveraging historically generated AEs without additional calculation cost. We further provide a theoretical analysis for the proposed initialization method. We also propose a simple yet effective regularizer based on the prior-guided initialization,i.e., the currently generated perturbation should not deviate too much from the prior-guided initialization. The regularizer adopts both historical and current adversarial perturbations to guide the model learning. Evaluations on four datasets demonstrate that the proposed method can prevent catastrophic overfitting and outperform state-of-the-art FAT methods. The code is released at https://github.com/jiaxiaojunQAQ/FGSM-PGI., Comment: ECCV 2022
Published: 2022

38. Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal

Author: Liu, Xinwei, Liu, Jian, Bai, Yang, Gu, Jindong, Chen, Tao, Jia, Xiaojun, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: As a common security tool, visible watermarking has been widely applied to protect copyrights of digital images. However, recent works have shown that visible watermarks can be removed by DNNs without damaging their host images. Such watermark-removal techniques pose a great threat to the ownership of images. Inspired by the vulnerability of DNNs on adversarial perturbations, we propose a novel defence mechanism by adversarial machine learning for good. From the perspective of the adversary, blind watermark-removal networks can be posed as our target models; then we actually optimize an imperceptible adversarial perturbation on the host images to proactively attack against watermark-removal networks, dubbed Watermark Vaccine. Specifically, two types of vaccines are proposed. Disrupting Watermark Vaccine (DWV) induces to ruin the host image along with watermark after passing through watermark-removal networks. In contrast, Inerasable Watermark Vaccine (IWV) works in another fashion of trying to keep the watermark not removed and still noticeable. Extensive experiments demonstrate the effectiveness of our DWV/IWV in preventing watermark removal, especially on various watermark removal networks., Comment: ECCV 2022
Published: 2022

39. High-performance cavity-enhanced quantum memory with warm atomic cell

Author: Ma, Lixia, Lei, Xing, Yan, Jieli, Li, Ruiyang, Chai, Ting, Yan, Zhihui, Jia, Xiaojun, Xie, Changde, and Peng, Kunchi
Subjects: Quantum Physics
Abstract: High-performance quantum memory for quantized states of light is a prerequisite building block of quantum information technology. Despite great progresses of optical quantum memories based on interactions of light and atoms, physical features of these memories still cannot satisfy requirements for applications in practical quantum information systems, since all of them suffer from trade-off between memory efficiency and excess noise. Here, we report a high-performance cavity-enhanced electromagnetically-induced-transparency memory with warm atomic cell in which a scheme of optimizing the spatial and temporal modes based on the time-reversal approach is applied. The memory efficiency up to 67% is directly measured and a noise level close to quantum noise limit is simultaneously reached. It has been experimentally demonstrated that the average fidelities for a set of input coherent states with different phases and amplitudes within a Gaussian distribution have exceeded the classical benchmark fidelities. Thus the realized quantum memory platform has been capable of preserving quantized optical states, and is ready to be applied in quantum information systems, such as distributed quantum logic gates and quantum-enhanced atomic magnetometry., Comment: 15 pages,2 figures
Published: 2022
Full Text: View/download PDF

40. LAS-AT: Adversarial Training with Learnable Attack Strategy

Author: Jia, Xiaojun, Zhang, Yong, Wu, Baoyuan, Ma, Ke, Wang, Jue, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Adversarial training (AT) is always formulated as a minimax problem, of which the performance depends on the inner optimization that involves the generation of adversarial examples (AEs). Most previous methods adopt Projected Gradient Decent (PGD) with manually specifying attack parameters for AE generation. A combination of the attack parameters can be referred to as an attack strategy. Several works have revealed that using a fixed attack strategy to generate AEs during the whole training phase limits the model robustness and propose to exploit different attack strategies at different training stages to improve robustness. But those multi-stage hand-crafted attack strategies need much domain expertise, and the robustness improvement is limited. In this paper, we propose a novel framework for adversarial training by introducing the concept of "learnable attack strategy", dubbed LAS-AT, which learns to automatically produce attack strategies to improve the model robustness. Our framework is composed of a target network that uses AEs for training to improve robustness and a strategy network that produces attack strategies to control the AE generation. Experimental evaluations on three benchmark databases demonstrate the superiority of the proposed method. The code is released at https://github.com/jiaxiaojunQAQ/LAS-AT.
Published: 2022

41. Defending against Model Stealing via Verifying Embedded External Features

Author: Li, Yiming, Zhu, Linghui, Jia, Xiaojun, Jiang, Yong, Xia, Shu-Tao, and Cao, Xiaochun
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Obtaining a well-trained model involves expensive data collection and training procedures, therefore the model is a valuable intellectual property. Recent studies revealed that adversaries can `steal' deployed models even when they have no training samples and can not get access to the model parameters or structures. Currently, there were some defense methods to alleviate this threat, mostly by increasing the cost of model stealing. In this paper, we explore the defense from another angle by verifying whether a suspicious model contains the knowledge of defender-specified \emph{external features}. Specifically, we embed the external features by tempering a few training samples with style transfer. We then train a meta-classifier to determine whether a model is stolen from the victim. This approach is inspired by the understanding that the stolen models should contain the knowledge of features learned by the victim model. We examine our method on both CIFAR-10 and ImageNet datasets. Experimental results demonstrate that our method is effective in detecting different types of model stealing simultaneously, even if the stolen model is obtained via a multi-stage stealing process. The codes for reproducing main results are available at Github (https://github.com/zlh-thu/StealingVerification)., Comment: This work is accepted by the AAAI 2022. The first two authors contributed equally to this work. 11 pages
Published: 2021

42. Lignocellulose degradation and temperature adaptation mechanisms during composting of mushroom residue and wood chips at low temperature with inoculation of psychrotolerant microbial agent

Author: Tian, Xueping, Qin, Xiaomeng, Jia, Xiaojun, Lyu, Qingyang, Li, Siqi, Jiang, Linwei, Chen, Lin, Yan, Zhiying, and Huang, Jun
Published: 2024
Full Text: View/download PDF

43. Boosting Fast Adversarial Training with Learnable Adversarial Initialization

Author: Jia, Xiaojun, Zhang, Yong, Wu, Baoyuan, Wang, Jue, and Cao, Xiaochun
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Adversarial training (AT) has been demonstrated to be effective in improving model robustness by leveraging adversarial examples for training. However, most AT methods are in face of expensive time and computational cost for calculating gradients at multiple steps in generating adversarial examples. To boost training efficiency, fast gradient sign method (FGSM) is adopted in fast AT methods by calculating gradient only once. Unfortunately, the robustness is far from satisfactory. One reason may arise from the initialization fashion. Existing fast AT generally uses a random sample-agnostic initialization, which facilitates the efficiency yet hinders a further robustness improvement. Up to now, the initialization in fast AT is still not extensively explored. In this paper, we boost fast AT with a sample-dependent adversarial initialization, i.e., an output from a generative network conditioned on a benign image and its gradient information from the target network. As the generative network and the target network are optimized jointly in the training phase, the former can adaptively generate an effective initialization with respect to the latter, which motivates gradually improved robustness. Experimental evaluations on four benchmark databases demonstrate the superiority of our proposed method over state-of-the-art fast AT methods, as well as comparable robustness to advanced multi-step AT methods. The code is released at https://github.com//jiaxiaojunQAQ//FGSM-SDI., Comment: Accepted by TIP
Published: 2021
Full Text: View/download PDF

44. Teleportation goes to Hertz rate

Author: Yan, Zhihui and Jia, Xiaojun
Published: 2023
Full Text: View/download PDF

45. An Effective and Robust Detector for Logo Detection

Author: Jia, Xiaojun, Yan, Huanqian, Wu, Yonglin, Wei, Xingxing, Cao, Xiaochun, and Zhang, Yong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In recent years, intellectual property (IP), which represents literary, inventions, artistic works, etc, gradually attract more and more people's attention. Particularly, with the rise of e-commerce, the IP not only represents the product design and brands, but also represents the images/videos displayed on e-commerce platforms. Unfortunately, some attackers adopt some adversarial methods to fool the well-trained logo detection model for infringement. To overcome this problem, a novel logo detector based on the mechanism of looking and thinking twice is proposed in this paper for robust logo detection. The proposed detector is different from other mainstream detectors, which can effectively detect small objects, long-tail objects, and is robust to adversarial images. In detail, we extend detectoRS algorithm to a cascade schema with an equalization loss function, multi-scale transformations, and adversarial data augmentation. A series of experimental results have shown that the proposed method can effectively improve the robustness of the detection model. Moreover, we have applied the proposed methods to competition ACM MM2021 Robust Logo Detection that is organized by Alibaba on the Tianchi platform and won top 2 in 36489 teams. Code is available at https://github.com/jiaxiaojunQAQ/Robust-Logo-Detection., Comment: 5 pages
Published: 2021

46. HSG-MGAF Net: Heterogeneous subgraph-guided multiscale graph attention fusion network for interpretable prediction of whole-slide image

Author: Liang, Meiyan, Jiang, Xing, Cao, Jie, Zhang, Shupeng, Liu, Haishun, Li, Bo, Wang, Lin, Zhang, Cunlin, and Jia, Xiaojun
Published: 2024
Full Text: View/download PDF

47. A robust magnetic composite catalyst derived from waste ore and biomass for efficient degradation of tetracycline by activated peroxymonosulfate

Author: Wang, Jingqi, Huang, Na, Wang, Guoliang, Yu, Jingwen, Wang, Fei, Zhang, Dongnian, Su, Feng, Jia, Xiaojun, Wang, Mengmeng, Meng, Xianbin, Kong, Chuncai, Yang, Zhimao, Wang, Tong, and Zhu, Hao
Published: 2024
Full Text: View/download PDF

48. Establishment of Evaluation System to Ensure the Safety of Environmental Industry in China Under the New Development Pattern

Author: Gao, Weizhen, Li, Menggang, Jia, Xiaojun, Zhao, Yuejiao, Zou, Chengzhang, Barbosa-Povoa, Ana Paula, Editorial Board Member, de Almeida, Adiel Teixeira, Editorial Board Member, Gans, Noah, Editorial Board Member, Gupta, Jatinder N. D., Editorial Board Member, Heim, Gregory R., Editorial Board Member, Hua, Guowei, Editorial Board Member, Kimms, Alf, Editorial Board Member, Li, Xiang, Editorial Board Member, Masri, Hatem, Editorial Board Member, Nickel, Stefan, Editorial Board Member, Qiu, Robin, Editorial Board Member, Shankar, Ravi, Editorial Board Member, Slowiński, Roman, Editorial Board Member, Tang, Christopher S., Editorial Board Member, Wu, Yuzhe, Editorial Board Member, Zhu, Joe, Editorial Board Member, Zopounidis, Constantin, Editorial Board Member, Li, Menggang, editor, Fu, Xiaowen, editor, Huang, Anqiang, editor, and Chang, Dan, editor
Published: 2023
Full Text: View/download PDF

49. The Impact of Digitalization on Teleworking Communication and Trust in the Post Epidemic Era

Author: Zhao, Yuejiao, Li, Menggang, Jia, Xiaojun, Gao, Weizhen, Zheng, Li, Barbosa-Povoa, Ana Paula, Editorial Board Member, de Almeida, Adiel Teixeira, Editorial Board Member, Gans, Noah, Editorial Board Member, Gupta, Jatinder N. D., Editorial Board Member, Heim, Gregory R., Editorial Board Member, Hua, Guowei, Editorial Board Member, Kimms, Alf, Editorial Board Member, Li, Xiang, Editorial Board Member, Masri, Hatem, Editorial Board Member, Nickel, Stefan, Editorial Board Member, Qiu, Robin, Editorial Board Member, Shankar, Ravi, Editorial Board Member, Slowiński, Roman, Editorial Board Member, Tang, Christopher S., Editorial Board Member, Wu, Yuzhe, Editorial Board Member, Zhu, Joe, Editorial Board Member, Zopounidis, Constantin, Editorial Board Member, Li, Menggang, editor, Fu, Xiaowen, editor, Huang, Anqiang, editor, and Chang, Dan, editor
Published: 2023
Full Text: View/download PDF

50. Exploring Graphic Design and its Interactivity in the Context of Cognitive Psychology

Author: Jia Xiaojun and Mao Yi
Subjects: bap analysis, group sparse decomposition, interaction genetic method, cognitive psychology, erp, graphic design, 00a73, Mathematics, QA1-939
Abstract: Cognitive psychology can help better understand human thinking and behavior, and its application in design can increase the infectious force of design. In this paper, because the brain will generate ERP signals when cognitive stimuli stimulate it, the group sparse decomposition method obtains ERP from EEG signals, and uses the BAP of brain regions to construct the interaction activity pattern of brain regions, and then combines with the interaction genetic algorithm to search for the optimal solution for the data in graphic design by using hierarchical degradation method for processing. The EEG data and interaction behavior characteristics of users were explored through chromatic stimulus response experiments and interaction behavior analysis. It was found that the accuracy of users’ responses to orange, green and purple were all higher than 60%, and P400 in EEG responded to the stimuli most significantly, with P-values of 0.0291, 0.0462, and 0.0355 at P5, P6, and Pz, respectively, with significant effects. The average number of interactions of 46 users in the graphic design enjoyment was 31.47, with the highest success rate of interaction reaching 99.81%, and the average duration of interaction experience for all subjects was 10.26 minutes. Graphic design based on cognitive psychology can capture people’s inner emotions and make better design solutions.
Published: 2024
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

488 results on '"Jia, Xiaojun"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources