Towards unlocking the mystery of adversarial fragility of neural networks

Authors :: Gao, Jingchao
Mudumbai, Raghu
Wu, Xiaodong
Yi, Jirong
Xu, Catherine
Xie, Hui
Xu, Weiyu
Publication Year :: 2024
Abstract: In this paper, we study the adversarial robustness of deep neural networks for classification tasks. We look at the smallest magnitude of possible additive perturbations that can change the output of a classification algorithm. We provide a matrix-theoretic explanation of the adversarial fragility of deep neural networks for classification. In particular, our theoretical results show that neural networks' adversarial robustness can degrade as the input dimension $d$ increases. Analytically, we show that neural networks' adversarial robustness can be only $1/\sqrt{d}$ of the best possible adversarial robustness. Our matrix-theoretic explanation is consistent with an earlier information-theoretic feature-compression-based explanation for the adversarial fragility of neural networks.
Comment: 21 pages
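
As a rough illustration of the quantity the abstract refers to (the smallest additive perturbation that changes a classifier's output), the following NumPy sketch uses a toy linear setting. It is not the paper's construction: the two Gaussian class points, the function names `min_l2_perturbation` and `robustness_ratio`, and the choice of a classifier that reads only one input coordinate are all assumptions made here for illustration. For a linear decision rule `sign(w @ x + b)`, the smallest L2 perturbation that reaches the decision boundary is `|w @ x + b| / ||w||`, so the sketch compares that distance for a classifier using the full difference direction against one using a single coordinate.

```python
import numpy as np

def min_l2_perturbation(w, b, x):
    """Smallest L2 norm of an additive perturbation that moves x onto the
    decision boundary of the linear rule sign(w @ x + b)."""
    return abs(w @ x + b) / np.linalg.norm(w)

def robustness_ratio(d, seed=0):
    """Toy comparison (an assumption, not the paper's model): ratio of the
    minimal perturbation for a one-coordinate classifier to that of a
    classifier using the full direction between two random points."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(d)
    x2 = rng.standard_normal(d)
    mid = (x1 + x2) / 2.0

    # Classifier A: hyperplane bisecting the segment x1-x2.
    # Its margin at x1 is ||x1 - x2|| / 2.
    w_full = x1 - x2
    b_full = -(w_full @ mid)

    # Classifier B (hypothetical "compressed" feature): thresholds coordinate 0 only.
    w_one = np.zeros(d)
    w_one[0] = 1.0
    b_one = -mid[0]

    r_full = min_l2_perturbation(w_full, b_full, x1)
    r_one = min_l2_perturbation(w_one, b_one, x1)
    return r_one / r_full

if __name__ == "__main__":
    for d in (10, 100, 1000, 10000):
        ratios = [robustness_ratio(d, seed=s) for s in range(200)]
        # Under these toy assumptions the mean ratio shrinks roughly like 1/sqrt(d).
        print(d, np.mean(ratios), np.mean(ratios) * np.sqrt(d))
```

In this toy setting the ratio equals `|x1[0] - x2[0]| / ||x1 - x2||`, which is at most 1 and tends to shrink as `d` grows. This only mirrors the flavor of the $1/\sqrt{d}$ scaling stated in the abstract; the paper's own analysis concerns neural networks and is not reproduced here.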