Tags: python-3.x, tensorflow2.0, tensorflow-federated, federated-learning

Why can't I see the local epochs output when training tensorflow federated learning model?


I am training a TensorFlow Federated learning model, but I cannot see the output of the local epochs. Details are as follows:

split = 4
NUM_ROUNDS = 5
NUM_EPOCHS = 10
BATCH_SIZE = 2
PREFETCH_BUFFER = 5

for round_num in range(1, NUM_ROUNDS+1):
    state, tff_metrics = iterative_process.next(state, federated_train_data) 
    print('round {:2d}, metrics={}'.format(round_num, tff_metrics['train'].items()))
    
    eval_model = create_keras_model()
    eval_model.compile(optimizer=optimizers.Adam(learning_rate=client_lr),
                       loss=losses.BinaryCrossentropy(),
                       metrics=[tf.keras.metrics.Accuracy()])
    
    #tff.learning.assign_weights_to_keras_model(eval_model, state.model)
    state.model.assign_weights_to(eval_model)
    
    ev_result = eval_model.evaluate(x_val, y_val, verbose=2)
    # Log each training metric of this round as a TensorBoard scalar.
    train_metrics = tff_metrics['train']
    for name, value in train_metrics.items():
        tf.summary.scalar(name, value, step=round_num)
    
    tff_val_acc.append(ev_result[1])
    tff_val_loss.append(ev_result[0])

And my output looks as follows:


    round  1, metrics=odict_items([('accuracy', 0.0), ('loss', 1.2104079)])
    1/1 - 1s - loss: 0.7230 - accuracy: 0.0000e+00 - 1s/epoch - 1s/step
    round  2, metrics=odict_items([('accuracy', 0.0007142857), ('loss', 1.2233553)])
    1/1 - 1s - loss: 0.6764 - accuracy: 0.0000e+00  - 646ms/epoch - 646ms/step
    round  3, metrics=odict_items([('accuracy', 0.0),  ('loss', 1.1939998)])
    1/1 - 1s - loss: 0.6831 - accuracy: 0.0000e+00  - 635ms/epoch - 635ms/step
    round  4, metrics=odict_items([('accuracy', 0.0), ('loss', 1.2829995)])
    1/1 - 1s - loss: 0.6830 - accuracy: 0.0000e+00  - 641ms/epoch - 641ms/step
    round  5, metrics=odict_items([('accuracy', 0.0),  ('loss', 1.2051892)])
    1/1 - 1s - loss: 0.7135 - accuracy: 0.0000e+00 - 621ms/epoch - 621ms/step

Are these the values for the global model after each round? How can I plot the validation accuracy curve of the global model over the 100 epochs (10 rounds, 10 local epochs per round), without using TensorBoard?


Solution

  • Why can't I see the local epochs output when training tensorflow federated learning model?

    Generally, in federated learning each client performs local computation that is not visible to the server. In this case, the server (and we as modelers) only sees the result of that local training, not the individual epochs.
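
    To make this concrete, here is a toy, pure-Python sketch of one federated averaging loop. It is not the TFF implementation; the functions `client_update` and `server_round`, the learning rate, and the tiny datasets are all hypothetical and chosen only for illustration. The point is that the local epoch loop lives inside the client function, and only the final weight and an averaged loss come back to the server.

```python
# Toy federated averaging for the model y = w * x (illustrative, not TFF).

def client_update(w, data, num_epochs, lr=0.05):
    """Run num_epochs local epochs of SGD; return only the final weight
    and the loss averaged over every local step."""
    step_losses = []
    for _ in range(num_epochs):      # local epochs: never reported one by one
        for x, y in data:
            err = w * x - y
            step_losses.append(err * err)
            w -= lr * 2 * err * x    # gradient of the squared error w.r.t. w
    return w, sum(step_losses) / len(step_losses)


def server_round(w_global, client_datasets, num_epochs):
    """One round: each client trains locally, the server averages the results
    weighted by the number of examples per client."""
    results = [client_update(w_global, d, num_epochs) for d in client_datasets]
    sizes = [len(d) for d in client_datasets]
    total = sum(sizes)
    new_w = sum(w * n for (w, _), n in zip(results, sizes)) / total
    train_loss = sum(l * n for (_, l), n in zip(results, sizes)) / total
    return new_w, {"loss": train_loss}


# Hypothetical client data; every example follows y = 2 * x.
clients = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]

w = 0.0
history = []
for round_num in range(1, 6):
    w, metrics = server_round(w, clients, num_epochs=10)
    history.append(metrics["loss"])
    print("round {:2d}, metrics={}".format(round_num, metrics))
```

    The loop prints one line per round, just like the output in the question, even though each client ran 10 local epochs. The reported training loss is also averaged over all local steps, including the early ones taken before the weight improved.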

    Are these values for global model after each round?

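    Yes, the logging statements are a mix of training and validation metrics of the global model after each round. Note that the training metrics in federated learning have a subtle peculiarity: they are aggregated from the clients' local training steps, so they describe the model while it is still being updated during the round rather than the final global model that the round produces.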

    round  1, metrics=odict_items([('accuracy', 0.0), ('loss', 1.2104079)])
    

    These lines are the training metrics, and are produced by this code:

    print('round {:2d}, metrics={}'.format(round_num, tff_metrics['train'].items()))
    

    The validation metrics are printed by Keras. These logging statements:

     1/1 - 1s - loss: 0.7230 - accuracy: 0.0000e+00 - 1s/epoch - 1s/step
    

    are being printed by this line:

    ev_result = eval_model.evaluate(x_val, y_val, verbose=2)
    

    How can I plot the curves for validation accuracy of the global model for the 100 epochs (10 rounds, 10 local epochs per round)?

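    The tff_val_acc and tff_val_loss lists hold the validation metric values, one entry per round (index 0 is round 1). Because the global model is only updated once per round, there is one validation point per round rather than one per local epoch. A plotting library such as matplotlib (https://matplotlib.org/) is one option for drawing these curves.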
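
    A minimal matplotlib sketch of that idea follows. The list values are stand-ins copied from the Keras log lines in the question; in the real script you would plot the tff_val_acc and tff_val_loss lists filled by the training loop. The x-axis scaling by NUM_EPOCHS and the output filename are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt

NUM_EPOCHS = 10  # local epochs per round, as in the question

# Stand-in values taken from the evaluate() log lines in the question.
tff_val_loss = [0.7230, 0.6764, 0.6831, 0.6830, 0.7135]
tff_val_acc = [0.0, 0.0, 0.0, 0.0, 0.0]

# One validation point per round; express the x-axis in cumulative local epochs.
rounds = list(range(1, len(tff_val_acc) + 1))
local_epochs = [r * NUM_EPOCHS for r in rounds]

fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))

ax_acc.plot(local_epochs, tff_val_acc, marker="o")
ax_acc.set_xlabel("cumulative local epochs")
ax_acc.set_ylabel("validation accuracy")
ax_acc.set_title("Global model validation accuracy")

ax_loss.plot(local_epochs, tff_val_loss, marker="o")
ax_loss.set_xlabel("cumulative local epochs")
ax_loss.set_ylabel("validation loss")
ax_loss.set_title("Global model validation loss")

fig.tight_layout()
fig.savefig("tff_validation_curves.png")  # hypothetical output filename
```

    To plot against the round number instead, pass rounds as the x values and relabel the axis accordingly.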