Not So Robust after All: Evaluating the Robustness of Deep Neural Networks to Unseen Adversarial Attacks

Garaev, Roman; Rasheed, Bader; Khan, Adil Mehmood

Abstract

Deep neural networks (DNNs) have gained prominence in various applications, but remain vulnerable to adversarial attacks that manipulate data to mislead a DNN. This paper aims to challenge the efficacy and transferability of two contemporary defense mechanisms against adversarial attacks: (a) robust training and (b) adversarial training. The former suggests that training a DNN on a data set consisting solely of robust features should produce a model resistant to adversarial attacks. The latter creates an adversarially trained model that learns to minimise an expected training loss over a distribution of bounded adversarial perturbations. We reveal a significant lack of transferability in these defense mechanisms and provide insight into the potential dangers posed by L∞-norm attacks previously underestimated by the research community. Such conclusions are based on extensive experiments involving (1) different model architectures, (2) the use of canonical correlation analysis, (3) visual and quantitative analysis of the neural network's latent representations, (4) an analysis of networks' decision boundaries and (5) the use of equivalence of L2 and L∞ perturbation norm theories.
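The L∞-bounded perturbations discussed above can be illustrated with a minimal FGSM-style sketch. This is not the paper's experimental setup: the toy logistic-regression "model" and all variable names below are assumptions chosen so the example stays self-contained, but the sign-of-gradient step and the L∞ budget `eps` are the standard construction.

```python
import numpy as np

# Minimal FGSM-style L-infinity attack on a toy logistic-regression model.
# The model, data, and names here are illustrative, not the paper's setup.

rng = np.random.default_rng(0)
w = rng.normal(size=8)   # fixed "model" weights
x = rng.normal(size=8)   # clean input
y = 1.0                  # true label

def loss_grad_x(w, x, y):
    # Gradient of the binary cross-entropy loss w.r.t. the input x:
    # dL/dx = (sigmoid(w.x) - y) * w
    p = 1.0 / (1.0 + np.exp(-w @ x))
    return (p - y) * w

eps = 0.1  # L-infinity perturbation budget
x_adv = x + eps * np.sign(loss_grad_x(w, x, y))

# Every coordinate moves by at most eps, so the L-infinity bound holds exactly.
assert np.max(np.abs(x_adv - x)) <= eps + 1e-12
```

For this linear model the single sign step provably lowers the predicted probability of the true class; adversarial training, as described in the abstract, minimises the expected loss over exactly such eps-bounded perturbations.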

Citation

Garaev, R., Rasheed, B., & Khan, A. M. (2024). Not So Robust after All: Evaluating the Robustness of Deep Neural Networks to Unseen Adversarial Attacks. Algorithms, 17, Article 162. https://doi.org/10.3390/a17040162

Journal Article Type Article
Acceptance Date Apr 15, 2024
Online Publication Date Apr 19, 2024
Publication Date 2024
Deposit Date Apr 19, 2024
Publicly Available Date Apr 22, 2024
Journal Algorithms
Print ISSN 1999-4893
Electronic ISSN 1999-4893
Publisher MDPI
Peer Reviewed Peer Reviewed
Volume 17
Article Number 162
DOI https://doi.org/10.3390/a17040162
Keywords machine learning; deep learning; adversarial attacks
Public URL https://hull-repository.worktribe.com/output/4627382
