Deep neural rejection against adversarial examples

Authors :
Fabio Roli
Marco Melis
Ambra Demontis
Angelo Sotgiu
Giorgio Fumera
Xiaoyi Feng
Battista Biggio
Source :
EURASIP Journal on Information Security, Vol 2020, Iss 1, Pp 1-10 (2020)
Publication Year :
2020
Publisher :
SpringerOpen, 2020.

Abstract

Despite the impressive performance reported for deep neural networks in different application domains, they remain largely vulnerable to adversarial examples, i.e., input samples that are carefully perturbed to cause misclassification at test time. In this work, we propose a deep neural rejection mechanism to detect adversarial examples, based on the idea of rejecting samples that exhibit anomalous feature representations at different network layers. Compared with competing approaches, our method does not require generating adversarial examples at training time, and it is less computationally demanding. To properly evaluate our method, we define an adaptive white-box attack that is aware of the defense mechanism and aims to bypass it. Under this worst-case setting, we empirically show that our approach outperforms previously proposed methods that detect adversarial examples by only analyzing the feature representation provided by the output network layer.
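
The abstract does not spell out how the rejection rule is computed, so the following is only a minimal sketch of the general idea of rejecting samples whose feature representations look anomalous at several layers. Everything in it is an assumption made for illustration and not the authors' implementation: the per-layer class centroids, the RBF similarity score, the averaging of scores across layers, the threshold value, and the names `rbf_scores`, `predict_with_reject`, and `REJECT`.

```python
import math

REJECT = -1  # hypothetical label meaning "rejected as anomalous"


def rbf_scores(feat, centroids, gamma=1.0):
    """Similarity of one feature vector to each class centroid, in (0, 1]."""
    return [
        math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(feat, c)))
        for c in centroids
    ]


def predict_with_reject(layer_feats, layer_centroids, threshold=0.5, gamma=1.0):
    """Classify one sample from its features at several layers, or reject it.

    layer_feats: one feature vector per layer (assumed already extracted).
    layer_centroids: for each layer, one centroid per class (assumed to be
        estimated from clean training data only).
    """
    per_layer = [
        rbf_scores(f, c, gamma) for f, c in zip(layer_feats, layer_centroids)
    ]
    n_layers = len(per_layer)
    # Average each class's score over the layers.
    combined = [sum(class_scores) / n_layers for class_scores in zip(*per_layer)]
    best = max(range(len(combined)), key=combined.__getitem__)
    # Reject when even the best class is only weakly supported.
    return best if combined[best] >= threshold else REJECT


# Two layers, two classes; the centroids are made-up toy values.
centroids = [[[0.0, 0.0], [4.0, 4.0]], [[0.0], [3.0]]]
print(predict_with_reject([[0.1, 0.0], [0.1]], centroids))  # close to class 0
print(predict_with_reject([[2.0, 2.0], [1.5]], centroids))  # far from both classes
```

In this sketch no adversarial examples are needed to fit anything, which mirrors the property claimed in the abstract; only the threshold has to be chosen, for instance from the score distribution of clean validation data.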

Details

Language :
English
Volume :
2020
Issue :
1
Database :
OpenAIRE
Journal :
EURASIP Journal on Information Security
Accession number :
edsair.doi.dedup.....743a10554e491ad340e49f86d3c698c7
Full Text :
https://doi.org/10.1186/s13635-020-00105-y