Problem of understanding in the FGSM attack #1545

Bulby-Bull · 2022-02-13T16:10:00Z

Hi,

I've been trying to figure out how the targeted and untargeted modes work in FGSM, and I'm having some trouble.

Context: I created an MLP model with Keras for binary classification, with 1 hidden layer (ReLU) and a softmax output layer with 2 neurons. The dataset has 2 classes and 2 features, generated from a binomial distribution. I provide the code at this URL:
Colab example

After training the model with the cross-entropy loss function, I used FGSM to create adversarial examples and observe the "targeted" and "untargeted" behaviour. My results are given here; they are obviously specific to my model, so you may get slightly different results.
In the table, L2 and Linf are the norms, epsilon is the attack's epsilon parameter, Target C0 and Target C1 are the target classes passed as the target parameter (y) of the generate function, and true labels are the original y_test from the dataset.

Questions:

Why, when using the L2 norm and passing the target C0, does class 0 go to class 1 (incorrectly) while class 1 correctly goes to class 0?

Why, still with the L2 norm, does class 0 tend to go only halfway to class 1 or class 0 regardless of the target, and even in untargeted mode, while class 1 moves correctly?

Why, in untargeted mode, is the parameter y taken into account when it is supposed to be neutral? (The results differ depending on whether y = class 0 or class 1.)

Can you explain the behaviour of the untargeted mode? It seems to be complementary to the targeted mode, but there are some inconsistencies (passing y_test to generate in targeted mode is equivalent to passing the reverse of y_test in untargeted mode).
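
For readers following the questions above, here is a minimal NumPy sketch of a single FGSM step on a 2-class softmax model. It is not ART's implementation: the linear model (`W`, `b`), the helper names `loss_gradient` and `fgsm_step`, and the sample values are all assumptions made for illustration. It shows that `y` enters the loss gradient in both modes (ART's documentation notes that in untargeted mode the model's own predictions are used as labels when `y` is omitted), and that with only two classes a targeted step toward one class coincides with an untargeted step away from the other.

```python
import numpy as np

# Hypothetical 2-feature, 2-class linear softmax model (not the notebook's MLP).
W = np.array([[1.0, -0.5],
              [-1.0, 0.5]])   # shape (n_classes, n_features)
b = np.array([0.1, -0.1])


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def loss_gradient(x, y_onehot):
    """Gradient of the cross-entropy loss with respect to the inputs x."""
    p = softmax(x @ W.T + b)
    return (p - y_onehot) @ W  # chain rule through the logits


def fgsm_step(x, y_onehot, eps, norm=np.inf, targeted=False):
    grad = loss_gradient(x, y_onehot)
    if norm == np.inf:
        direction = np.sign(grad)
    else:  # L2: rescale each sample's gradient to unit length
        direction = grad / np.linalg.norm(grad, axis=1, keepdims=True)
    # Untargeted: increase the loss for y. Targeted: decrease the loss for y.
    return x - eps * direction if targeted else x + eps * direction


x = np.array([[0.3, -0.2], [-0.4, 0.6]])
class0 = np.array([[1.0, 0.0], [1.0, 0.0]])
class1 = np.array([[0.0, 1.0], [0.0, 1.0]])

for norm in (np.inf, 2):
    toward_0 = fgsm_step(x, class0, eps=0.5, norm=norm, targeted=True)
    away_from_1 = fgsm_step(x, class1, eps=0.5, norm=norm, targeted=False)
    # In the 2-class case the two loss gradients point in opposite directions,
    # so both calls produce the same adversarial points.
    print(norm, np.allclose(toward_0, away_from_1))
```

In this sketch the label only decides which loss is differentiated; the `targeted` flag only decides the sign of the step. That is why changing `y` changes the result even in untargeted mode.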

Thank you in advance. If any of the questions are unclear, please feel free to ask me for more information.

Benjamin.

beat-buesser · 2022-02-15T00:45:09Z

Maintainer

Hi @Bulby-Bull, thank you very much for using ART! That's an interesting set of experiments.

About your first question, why do you think that "by passing the target C0, class 0 goes to class 1 (incorrectly) while class 1 correctly goes to class 0"?

2 replies

Bulby-Bull Feb 17, 2022
Author

Thanks

"About your first question, why do you think that 'by passing the target C0, class 0 goes to class 1 (incorrectly) while class 1 correctly goes to class 0'?"

Because when I pass class 0 as the target in targeted mode, the results show that class 1 goes to class 0 and class 0 goes to class 1 (which is therefore incorrect). If I pass class 1 as the target, class 0 goes to class 1 and class 1 moves even further towards class 1 (which is understandable). Why are these two behaviours not complementary? Why does class 0 move when it is the target, unlike class 1, which correctly moves in the same direction?

beat-buesser Feb 18, 2022
Maintainer

Hi @Bulby-Bull, I think this behaviour of attacking class 0 with target class 0 is likely caused by numerical noise. The binary classification loss is almost zero because the model is very confident that the sample is in the correct class, but backpropagating the loss gradients still yields numerically non-zero values. The single-step FGSM attack scales these to a rather large step, sometimes in a random direction because of the random numerical noise. It also does not always occur: when repeating the notebook, some runs look more like:

ART provides AutoAttack, which can wrap all other attacks to make sure that the wrapped attack only attacks samples that are not yet in the desired class.
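
The two points in this reply can be illustrated with a small NumPy sketch. It is an illustration under stated assumptions, not ART's code: the gradient values are made up, the normalisation is written without the small tolerance a real implementation might add, and `step_only_where_needed` is a hypothetical helper that only mimics the idea of skipping samples already in the desired class. The first part shows that sign and L2 normalisation discard the gradient's magnitude, so a gradient at noise level still produces a step of full size epsilon. The second part applies a step only to samples whose prediction differs from the target.

```python
import numpy as np

eps = 0.5

# Made-up gradients: one meaningful, one at numerical-noise level.
grad_real = np.array([0.8, -0.3])
grad_noise = np.array([3e-17, -1e-17])

for grad in (grad_real, grad_noise):
    step_linf = eps * np.sign(grad)
    step_l2 = eps * grad / np.linalg.norm(grad)
    # Both steps have size eps regardless of how small the gradient was.
    print(np.abs(step_linf).max(), np.linalg.norm(step_l2))


def step_only_where_needed(x, predicted, target, step):
    """Hypothetical helper: perturb only samples not yet in the target class."""
    needs_attack = predicted != target          # boolean mask, shape (n_samples,)
    x_adv = x.copy()
    x_adv[needs_attack] += step[needs_attack]
    return x_adv


x = np.array([[0.3, -0.2], [-0.4, 0.6]])
predicted = np.array([0, 1])                    # assumed model predictions
step = np.full_like(x, eps)                     # placeholder perturbation
x_adv = step_only_where_needed(x, predicted, target=0, step=step)
print(x_adv)  # first row unchanged, second row shifted by eps
```

With a mask like this, a sample that the model already assigns to the target class is never moved, so a noise-driven step cannot push it out of that class.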
