Combating Adversaries with Anti-Adversaries

Authors :
Alfarra, Motasem
Pérez, Juan C.
Thabet, Ali
Bibi, Adel
Torr, Philip H. S.
Ghanem, Bernard
Publication Year :
2021

Abstract

Deep neural networks are vulnerable to small input perturbations known as adversarial attacks. Inspired by the fact that these adversaries are constructed by iteratively minimizing the confidence of a network for the true class label, we propose the anti-adversary layer, aimed at countering this effect. In particular, our layer generates an input perturbation in the opposite direction of the adversarial one and feeds the classifier a perturbed version of the input. Our approach is training-free and theoretically supported. We verify the effectiveness of our approach by combining our layer with both nominally and robustly trained models and conduct large-scale experiments from black-box to adaptive attacks on CIFAR10, CIFAR100, and ImageNet. Our layer significantly enhances model robustness while coming at no cost on clean accuracy.

Comment: Accepted to AAAI Conference on Artificial Intelligence (AAAI'22)
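
The idea in the abstract, a perturbation that moves the input in the direction opposite to an adversarial one, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a toy linear softmax classifier with hand-picked weights, uses the classifier's own prediction as a pseudo-label, and takes a few signed-gradient descent steps on the cross-entropy loss. The names `anti_adversary`, `steps`, and `alpha` are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, b, x):
    # Toy linear classifier (an assumption): logits = W x + b.
    logits = [sum(w * v for w, v in zip(row, x)) + b_k for row, b_k in zip(W, b)]
    return softmax(logits)

def loss_grad_wrt_input(W, b, x, label):
    # Gradient of the cross-entropy loss w.r.t. the input for a linear
    # softmax model: W^T (p - onehot(label)).
    p = predict_proba(W, b, x)
    residual = [p_k - (1.0 if k == label else 0.0) for k, p_k in enumerate(p)]
    return [sum(residual[k] * W[k][i] for k in range(len(W))) for i in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def anti_adversary(W, b, x, steps=2, alpha=0.1):
    # Pseudo-label: the class the model already predicts for x.
    probs = predict_proba(W, b, x)
    pseudo_label = probs.index(max(probs))
    gamma = [0.0] * len(x)
    for _ in range(steps):
        shifted = [x_i + g_i for x_i, g_i in zip(x, gamma)]
        grad = loss_grad_wrt_input(W, b, shifted, pseudo_label)
        # Descend the loss (an attack would ascend it), so confidence grows.
        gamma = [g_i - alpha * sign(d) for g_i, d in zip(gamma, grad)]
    return [x_i + g_i for x_i, g_i in zip(x, gamma)], pseudo_label

# Hypothetical two-class example.
W = [[1.0, -1.0], [-1.0, 1.0]]
b = [0.0, 0.0]
x = [0.6, 0.4]
x_anti, label = anti_adversary(W, b, x)
print(label, predict_proba(W, b, x)[label], predict_proba(W, b, x_anti)[label])
```

In this toy setting the shifted input is approximately `[0.8, 0.2]`, the predicted class is unchanged, and the model's confidence in it rises. An attacker would then have to undo this extra margin before flipping the prediction, which is the intuition the abstract describes.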

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2103.14347
Document Type :
Working Paper