tensorflow cleverhans

Adversarially Robust Googlenet model


How can I train a GoogleNet model adversarially on my own image classification dataset?

For example, with the CleverHans library, the only datasets that come with batches ready to run the attacks on are MNIST and CIFAR.

I trained an image classifier (GoogleNet) on my own data using TensorFlow, and now I want to train the model with adversarial examples. Any ideas on how I can do this with the CleverHans library? Thanks.


Solution

  • The easiest approach is probably to start from your own GoogleNet training code and modify its loss. The CleverHans tutorial contains an example of such a modification, which adds a penalty so that the model is also trained on adversarial examples. It uses the loss implementation found here to define a weighted average between the cross-entropy on clean images and the cross-entropy on adversarial images.
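
To make the weighted-average idea concrete, here is a minimal sketch in plain NumPy. It is not the CleverHans loss implementation and it does not use the CleverHans API. It assumes a toy linear softmax classifier in place of GoogleNet, an FGSM attack written by hand, and equal weights of 0.5. The names `fgsm`, `adversarial_training_loss`, `eps` and `adv_weight` are made up for illustration. In a real TensorFlow setup you would compute the same two cross-entropies from your own model's logits and let the framework differentiate the combined loss.

```python
import numpy as np


def softmax_cross_entropy(logits, labels_onehot):
    """Mean cross-entropy between softmax(logits) and one-hot labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels_onehot * log_probs).sum(axis=1).mean()


def fgsm(x, labels_onehot, W, b, eps):
    """Hand-written FGSM for a linear softmax model (illustrative only).

    Computes x_adv = clip(x + eps * sign(dLoss/dx), 0, 1).
    """
    logits = x @ W + b
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    # Gradient of the cross-entropy w.r.t. the input of a linear model.
    grad_x = (probs - labels_onehot) @ W.T
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)


def adversarial_training_loss(x, labels_onehot, W, b, eps=0.1, adv_weight=0.5):
    """Weighted average of clean and adversarial cross-entropy."""
    x_adv = fgsm(x, labels_onehot, W, b, eps)
    clean_loss = softmax_cross_entropy(x @ W + b, labels_onehot)
    adv_loss = softmax_cross_entropy(x_adv @ W + b, labels_onehot)
    total = (1.0 - adv_weight) * clean_loss + adv_weight * adv_loss
    return total, clean_loss, adv_loss


# Toy stand-in for "your own dataset": 8 flattened images, 3 classes.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(8, 16))      # pixel values in [0, 1]
labels = np.eye(3)[rng.integers(0, 3, size=8)]
W = rng.normal(scale=0.5, size=(16, 3))      # hypothetical model weights
b = np.zeros(3)

total, clean_loss, adv_loss = adversarial_training_loss(x, labels, W, b)
x_adv = fgsm(x, labels, W, b, eps=0.1)
```

The point of the sketch is that nothing here depends on MNIST or CIFAR: the adversarial batch is generated from whatever batch you feed in, so your own input pipeline can stay as it is and only the loss changes.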