A3T: accuracy aware adversarial training

Enes Altinisik*, Safa Messaoud, Husrev Taha Sencar, Sanjay Chawla

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Adversarial training (AT) has been empirically shown to be more prone to overfitting than standard training. The exact underlying reasons are still not fully understood. In this paper, we identify one cause of overfitting related to current practices of generating adversarial examples from misclassified samples. We show that, following current practice, adversarial examples generated from misclassified samples are harder to classify than the original samples. This leads to a complex adjustment of the decision boundary during training and hence to overfitting. To mitigate this issue, we propose A3T, an accuracy-aware AT method that generates adversarial examples differently for misclassified and correctly classified samples. We show that our approach achieves better generalization while maintaining comparable robustness to state-of-the-art AT methods on a wide range of computer vision, natural language processing, and tabular tasks.
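
The abstract describes the method only at a high level: split each batch by whether the model currently classifies a sample correctly, and generate adversarial examples differently for the two groups. The sketch below is a minimal, hypothetical PyTorch-style illustration of that accuracy-aware split. The PGD attack, the choice to leave misclassified samples unperturbed, and all hyperparameter values are illustrative assumptions, not the paper's actual A3T procedure.

    # Minimal sketch of an accuracy-aware adversarial training step.
    # The PGD generator, epsilon/alpha/steps, and the decision to train
    # misclassified samples on their clean versions are assumptions made
    # for illustration; they are not taken from the paper.
    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
        """Standard PGD attack, assumed here as the adversarial example generator."""
        delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta), y)
            grad = torch.autograd.grad(loss, delta)[0]
            delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
            delta = delta.detach().requires_grad_(True)
        return (x + delta).detach()

    def accuracy_aware_step(model, optimizer, x, y):
        """One training step that treats correctly and incorrectly classified
        samples differently (the split is the idea stated in the abstract;
        the exact per-group treatment here is an assumption)."""
        model.eval()
        with torch.no_grad():
            correct = model(x).argmax(dim=1).eq(y)  # accuracy-aware split
        x_adv = x.clone()
        if correct.any():
            # Correctly classified samples: attack them as in standard AT.
            x_adv[correct] = pgd_attack(model, x[correct], y[correct])
        # Misclassified samples are left clean in this sketch (assumption),
        # so training does not make them even harder to classify.
        model.train()
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
        return loss.item()

The point of the split, per the abstract, is that attacking already-misclassified samples produces even harder training points and distorts the decision boundary; handling the two groups separately avoids that effect.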

Original language: English
Pages (from-to): 3191-3210
Number of pages: 20
Journal: Machine Learning
Volume: 112
Issue number: 9
DOIs
Publication status: Published - Sept 2023

Keywords

  • Accuracy aware adversarial training
  • Adversarial training
  • Overfitting in adversarial training
