
A lightweight unsupervised adversarial detector based on autoencoder and isolation forest.

Authors :
Liu, Hui
Zhao, Bo
Guo, Jiabao
Zhang, Kehuan
Liu, Peng
Source :
Pattern Recognition. Mar 2024, Vol. 147.
Publication Year :
2024

Abstract

Although deep neural networks (DNNs) have performed well on many perceptual tasks, they are vulnerable to adversarial examples, which are generated by adding slight but maliciously crafted perturbations to benign images. Adversarial detection is an important technique for identifying adversarial examples before they are fed into target DNNs. Previous studies on detecting adversarial examples either targeted specific attacks or required expensive computation, so designing a lightweight unsupervised detector remains a challenging problem. In this paper, we propose an AutoEncoder-based Adversarial Examples (AEAE) detector that can guard DNN models by detecting adversarial examples with low computation in an unsupervised manner. The AEAE consists of only a shallow autoencoder that plays two roles. First, a well-trained autoencoder has learned the manifold of benign examples. This autoencoder produces a large reconstruction error for adversarial images with large perturbations, so we can detect significantly perturbed adversarial examples based on the reconstruction error. Second, the autoencoder can filter out small noise and change the DNN's prediction on adversarial examples with small perturbations, which helps to detect slightly perturbed adversarial examples based on the prediction distance. To cover both cases, we use the reconstruction error and the prediction distance of benign images to construct a two-tuple feature set and train an adversarial detector with the isolation forest algorithm. We show empirically that AEAE is an unsupervised and inexpensive detector against most state-of-the-art attacks. Through detection in these two cases, adversarial examples have nowhere to hide.

• We observe that adversarial detection is sensitive to the perturbation level.
• We train a shallow autoencoder to extract two key features from adversarial examples.
• We propose a lightweight and unsupervised adversarial detector.
[ABSTRACT FROM AUTHOR]
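The detection pipeline the abstract describes can be sketched in a few lines: compute two features per input (autoencoder reconstruction error, and the distance between the DNN's predictions on the input and on its reconstruction), then fit an isolation forest on the features of benign examples only. The sketch below is a minimal illustration, not the authors' code: the `reconstruct` and `predict` functions are toy stand-ins for a trained autoencoder and target DNN, and all shapes and thresholds are assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions, not the paper's models):
# - reconstruct(x): a trained autoencoder's output; here a simple shrinkage.
# - predict(x): the target DNN's softmax output; here a random linear classifier.
W = rng.normal(size=(10, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def predict(x):
    return softmax(x @ W)

def reconstruct(x):
    # A real autoencoder maps inputs back toward the benign manifold;
    # this stub just shrinks toward zero to mimic noise filtering.
    return 0.9 * x

def features(x):
    """Two-tuple feature: (reconstruction error, prediction distance)."""
    x_hat = reconstruct(x)
    recon_err = np.linalg.norm(x - x_hat)
    pred_dist = np.linalg.norm(predict(x) - predict(x_hat), ord=1)
    return [recon_err, pred_dist]

# Unsupervised training: fit the isolation forest on benign features only.
benign = rng.normal(size=(200, 10))
detector = IsolationForest(random_state=0).fit(
    np.array([features(x) for x in benign])
)

# At test time, -1 marks feature outliers (adversarial-like inputs).
perturbed = rng.normal(size=(5, 10)) + 5.0  # strongly shifted inputs
flags = detector.predict(np.array([features(x) for x in perturbed]))
print(flags)
```

Because the forest is fit only on benign feature tuples, no adversarial examples or attack labels are needed, which is what makes the approach unsupervised; heavily perturbed inputs stand out through the first feature and lightly perturbed ones through the second.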

Details

Language :
English
ISSN :
0031-3203
Volume :
147
Database :
Academic Search Index
Journal :
Pattern Recognition
Publication Type :
Academic Journal
Accession number :
173976416
Full Text :
https://doi.org/10.1016/j.patcog.2023.110127