
Generating adversarial data from cleverhans attack models


I'm looking for a code example that shows how to generate training data from cleverhans' adversarial attacks.

adv_x = fgsm.generate_np(X_test, **fgsm_params)

This generates the adversarial x data, but how can I get the corresponding y?

adv_pred = model.predict_classes(adv_x)

And this will give the "fooled" results, right?

What I want is to show, side by side, the generated x, the true y, and the fooled y (by which I mean the model's predictions on the adversarial inputs, which may be wrong because of the attack). I'm using MNIST, in case that helps.
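To make the three arrays concrete, here is a minimal NumPy-only sketch of the bookkeeping. It does not call cleverhans: `fgsm_linear`, the toy linear softmax model (`W`, `b`), and the random MNIST-shaped inputs are all hypothetical stand-ins chosen so the snippet runs on its own. The assumption it illustrates is that an adversarial perturbation is meant to leave the true class unchanged, so the y for `adv_x` is just the original label array, while the fooled y is whatever the model predicts on `adv_x`.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def predict_proba(X, W, b):
    # Toy linear softmax classifier (stand-in for the real model).
    return softmax(X @ W + b)

def fgsm_linear(X, y, W, b, eps):
    # Hypothetical stand-in for the attack: one FGSM step for the toy model.
    # For logits X @ W + b, d(cross-entropy)/dX = (probs - onehot) @ W.T
    probs = predict_proba(X, W, b)
    onehot = np.eye(W.shape[1])[y]
    grad = (probs - onehot) @ W.T
    # Step in the sign of the gradient, then keep pixels in [0, 1].
    return np.clip(X + eps * np.sign(grad), 0.0, 1.0)

rng = np.random.default_rng(0)
n, d, k, eps = 8, 784, 10, 0.1           # 8 fake "images", MNIST-sized
X_test = rng.random((n, d))              # pixel values in [0, 1)
W = rng.normal(scale=0.1, size=(d, k))
b = np.zeros(k)
# Toy labels: taken from the clean predictions, so the toy model is
# fully accurate before the attack. Real code would use the dataset labels.
y_test = predict_proba(X_test, W, b).argmax(axis=1)

adv_x = fgsm_linear(X_test, y_test, W, b, eps)            # generated x
adv_y = y_test                                            # true y, unchanged
fooled_y = predict_proba(adv_x, W, b).argmax(axis=1)      # fooled y

for i in range(n):
    print(f"sample {i}: true y = {adv_y[i]}, fooled y = {fooled_y[i]}")
```

With the real setup, `adv_x` would come from `fgsm.generate_np(...)` and `fooled_y` from the model's predictions on it; the pairing `(adv_x, y_test)` is what would be used as adversarial training data.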


Solution

  • Based on the code snippets you shared, I would make two suggestions: