I'm trying to use the Elastic-Net (EAD) attack implemented in CleverHans to generate adversarial samples for a classification task. The main problem is that I'm trying to use it to obtain a higher confidence at classification time on a target class (different from the original one), but I'm not able to reach good results. The system I'm trying to fool is a DNN with a softmax output over 10 classes.
This is the code I'm currently using to generate adversarial samples:
import numpy as np
from cleverhans.attacks import ElasticNetMethod
from cleverhans.utils_keras import KerasModelWrapper

ead_params = {'binary_search_steps': 9, 'max_iterations': 100,
              'learning_rate': 0.001, 'clip_min': 0, 'clip_max': 1,
              'y_target': target}
adv_x = image
found_adv = False
threshold = 0.9
wrap = KerasModelWrapper(model)
ead = ElasticNetMethod(wrap, sess=sess)

while not found_adv:
    adv_x = ead.generate_np(adv_x, **ead_params)
    prediction = model.predict(adv_x)
    pred_class = np.argmax(prediction[0])
    confidence = prediction[0][pred_class]
    if pred_class == 0 and confidence >= threshold:
        found_adv = True
The while loop keeps generating samples until the target class is reached with a confidence greater than 90%. This code works with FGSM and Madry, but runs indefinitely with EAD.
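For reference, the acceptance test the loop relies on can be isolated into a tiny helper (a sketch in pure NumPy; the function name is mine, not part of CleverHans):

```python
import numpy as np

def reached_target(probs, target_class, threshold=0.9):
    """Return True when the model's top class is the target class and
    its softmax score clears the confidence threshold."""
    pred_class = int(np.argmax(probs))
    confidence = float(probs[pred_class])
    return pred_class == target_class and confidence >= threshold

# Example softmax outputs of a 10-class model:
print(reached_target(np.array([0.95] + [0.005] * 9), 0))  # True
print(reached_target(np.array([0.50, 0.50] + [0.0] * 8), 0))  # False: below threshold
```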
Library versions:
Tensorflow: 2.2.0 Keras: 2.4.3 Cleverhans: 2.0.0-451ccecad450067f99c333fc53592201
Can anyone help me? Thanks a lot.
For anyone interested in this problem, the previous code can be modified as follows to work properly:
FIRST SOLUTION:
prediction = model.predict(image)
initial_predicted_class = np.argmax(prediction[0])

ead_params = {'binary_search_steps': 9, 'max_iterations': 100,
              'learning_rate': 0.001, 'confidence': 1,
              'clip_min': 0, 'clip_max': 1, 'y_target': target}
adv_x = image
found_adv = False
threshold = 0.9
wrap = KerasModelWrapper(model)
ead = ElasticNetMethod(wrap, sess=sess)

while not found_adv:
    adv_x = ead.generate_np(adv_x, **ead_params)
    prediction = model.predict(adv_x)
    pred_class = np.argmax(prediction[0])
    confidence = prediction[0][pred_class]
    if pred_class == initial_predicted_class and confidence >= threshold:
        found_adv = True
    else:
        ead_params['confidence'] += 1
This uses the confidence parameter implemented in the library: we increase it by 1 whenever the probability of the target class has not yet reached the threshold.
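For context on why raising that parameter helps: EAD inherits the Carlini-Wagner-style targeted margin, whose loss term stays positive until the target logit beats every other logit by at least the confidence value κ, so larger κ keeps the optimizer pushing toward a more confident target prediction. A rough NumPy sketch of that margin (my own illustrative function, not the library's API):

```python
import numpy as np

def targeted_margin_loss(logits, target, kappa):
    """Carlini-Wagner-style targeted margin: positive until the target
    logit exceeds every other logit by at least kappa, then saturates
    at -kappa."""
    other = np.max(np.delete(logits, target))  # best non-target logit
    return max(other - logits[target], -kappa)

logits = np.array([2.0, 5.0, 1.0])
print(targeted_margin_loss(logits, 0, kappa=1.0))  # 3.0: still penalized
logits = np.array([7.0, 5.0, 1.0])
print(targeted_margin_loss(logits, 0, kappa=1.0))  # -1.0: margin satisfied
```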
SECOND SOLUTION:
prediction = model.predict(image)
initial_predicted_class = np.argmax(prediction[0])

ead_params = {'beta': 5e-3, 'binary_search_steps': 6, 'max_iterations': 10,
              'learning_rate': 3e-2, 'clip_min': 0, 'clip_max': 1}
threshold = 0.96
adv_x = image
found_adv = False
eps_hyp = 0.5
wrap = KerasModelWrapper(model)
ead = ElasticNetMethod(wrap, sess=sess)

while not found_adv:
    new_adv_x = ead.generate_np(adv_x, **ead_params)
    pert = new_adv_x - adv_x            # perturbation proposed by EAD
    new_adv_x = adv_x - eps_hyp * pert  # apply it with the opposite sign
    # min-max rescale back into [0, 1]
    new_adv_x = (new_adv_x - np.min(new_adv_x)) / (np.max(new_adv_x) - np.min(new_adv_x))
    adv_x = new_adv_x
    prediction = model.predict(new_adv_x)
    pred_class = np.argmax(prediction[0])
    confidence = prediction[0][pred_class]
    print(pred_class, confidence)
    if pred_class == initial_predicted_class and confidence >= threshold:
        found_adv = True
In the second solution there are the following modifications to the original code:
- initial_predicted_class is the class predicted by the model on the benign sample ("0" in our example).
- In the parameters of the algorithm (ead_params) we don't insert the target class.
- We can then obtain the perturbation computed by the algorithm as pert = new_adv_x - adv_x, where adv_x is the original image (in the first iteration of the while loop) and new_adv_x is the perturbed sample generated by the algorithm.
- This step is useful because the original EAD algorithm computes the perturbation that maximizes the loss w.r.t. class "0", but in our case we want to minimize it.
- So we can compute the new perturbed image as new_adv_x = adv_x - eps_hyp * pert (where eps_hyp is an epsilon hyperparameter I introduced to reduce the perturbation), and then we normalize the new perturbed image back into [0, 1].
- I've tested the code on a large number of images, and the confidence always increases, so I think this can be a good solution for this purpose.
I think the second solution allows obtaining finer perturbations.
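The sign flip and renormalization described above can be sketched in isolation with pure NumPy (the random arrays here are stand-ins for the image batch and the EAD output, not real data):

```python
import numpy as np

rng = np.random.default_rng(0)
adv_x = rng.random((1, 4))                             # stand-in original image batch
new_adv_x = adv_x + 0.1 * rng.standard_normal((1, 4))  # stand-in EAD output

eps_hyp = 0.5
pert = new_adv_x - adv_x            # perturbation proposed by the attack
new_adv_x = adv_x - eps_hyp * pert  # reverse its direction and shrink it
# min-max rescale so the result lies exactly in [0, 1] again
new_adv_x = (new_adv_x - new_adv_x.min()) / (new_adv_x.max() - new_adv_x.min())

print(new_adv_x.min(), new_adv_x.max())  # 0.0 and 1.0
```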