Adversarial Examples Detection and Analysis with Layer-wise Autoencoders

Authors :: Wójcik, Bartosz
Morawiecki, Paweł
Śmieja, Marek
Krzyżek, Tomasz
Spurek, Przemysław
Tabor, Jacek
Source :: 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI).
Publication Year :: 2021
Publisher :: IEEE, 2021.
Abstract :: We present a mechanism for detecting adversarial examples based on data representations taken from the hidden layers of the target network. For this purpose, we train individual autoencoders at intermediate layers of the target network. This allows us to describe the manifold of true data and, in consequence, decide whether a given example has the same characteristics as true data. It also gives us insight into the behavior of adversarial examples and their flow through the layers of a deep neural network. Experimental results show that our method outperforms the state of the art in supervised and unsupervised settings.
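
The idea in the abstract (one autoencoder per hidden layer, fitted on clean data, used to judge whether a new input resembles that data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-layer toy network `layer_activations`, the `TinyAutoencoder` class (a linear autoencoder with tied weights and a one-unit bottleneck), the synthetic clean data, and the decision rule that flags an input when its reconstruction error at any layer exceeds a threshold derived from clean data. The scoring actually used by the authors may differ.

```python
import math
import random


def layer_activations(x):
    """Hypothetical two-layer target network: returns each layer's hidden activations."""
    h1 = [math.tanh(x[0] + x[1]), math.tanh(x[0] - x[1])]
    h2 = [math.tanh(h1[0] + 0.5 * h1[1]), math.tanh(0.5 * h1[0] - h1[1])]
    return [h1, h2]


class TinyAutoencoder:
    """Illustrative linear autoencoder with tied weights and a one-unit bottleneck."""

    def __init__(self, dim, rng):
        self.w = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
        self.mean = [0.0] * dim

    def _residual(self, h):
        c = [a - m for a, m in zip(h, self.mean)]           # center the activation
        z = sum(wi * ci for wi, ci in zip(self.w, c))       # encode to one number
        e = [z * wi - ci for wi, ci in zip(self.w, c)]      # decode minus input
        return c, z, e

    def fit(self, data, epochs=200, lr=0.05):
        self.mean = [sum(col) / len(data) for col in zip(*data)]
        for _ in range(epochs):
            for h in data:
                c, z, e = self._residual(h)
                ew = sum(ei * wi for ei, wi in zip(e, self.w))
                # Gradient of the squared reconstruction error w.r.t. the tied weights.
                self.w = [wi - lr * 2.0 * (ci * ew + z * ei)
                          for wi, ci, ei in zip(self.w, c, e)]

    def error(self, h):
        _, _, e = self._residual(h)
        return sum(ei * ei for ei in e)


rng = random.Random(0)

# Assumed "true data": inputs on the line x0 == x1.
clean = [(-1.0 + 0.1 * i, -1.0 + 0.1 * i) for i in range(21)]
clean_acts = [layer_activations(x) for x in clean]

# One autoencoder per hidden layer, trained only on clean activations.
autoencoders, thresholds = [], []
for layer in range(2):
    acts = [a[layer] for a in clean_acts]
    ae = TinyAutoencoder(dim=2, rng=rng)
    ae.fit(acts)
    autoencoders.append(ae)
    # Assumed rule: threshold slightly above the largest clean reconstruction error.
    thresholds.append(1.1 * max(ae.error(h) for h in acts))


def is_adversarial(x):
    """Flag x if any layer's reconstruction error exceeds that layer's threshold."""
    acts = layer_activations(x)
    return any(ae.error(h) > t for ae, h, t in zip(autoencoders, acts, thresholds))


print(is_adversarial((0.5, 0.5)))    # input on the clean line
print(is_adversarial((0.8, -0.8)))   # input far from the clean line
```

In this toy setting the per-layer errors can also be inspected individually, which loosely mirrors the abstract's point about following an example's flow through the layers. A realistic version would replace the toy network with activations captured from a trained model and use nonlinear autoencoders.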

Subjects :: FOS: Computer and information sciences
Computer Science - Machine Learning
Computer Science - Cryptography and Security
Statistics - Machine Learning
Machine Learning (stat.ML)
Cryptography and Security (cs.CR)
Machine Learning (cs.LG)

Database :: OpenAIRE
Accession number :: edsair.doi.dedup.....1d058fff6de9c780ffb6f79199f5d90d
Full Text :: https://doi.org/10.1109/ictai52525.2021.00209