I need to apply exponential decay to the learning rate every 10 epochs. The initial learning rate is 0.000001 and the decay factor is 0.95. Is this the proper way to set it up?
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.000001,
    decay_steps=my_steps_per_epoch * 10,
    decay_rate=0.05)
opt = tf.keras.optimizers.SGD(learning_rate=lr_schedule, momentum=0.9)
The formula for exponential decay, as I understand it, is current_lr = initial_lr * (1 - decay_factor)^t, except that in the code it is implemented as:

decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)
To my knowledge, decay_rate should be 1 - decay_factor, and decay_steps should mean how many steps are performed before the decay is applied; in my case, my_steps_per_epoch * 10. Is that correct?
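To make sure I understand the behaviour, this is how I would sanity-check the schedule by evaluating it directly at a few step counts (my_steps_per_epoch = 100 is just a placeholder here):

import tensorflow as tf

my_steps_per_epoch = 100  # placeholder; the real value comes from my dataset
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.000001,
    decay_steps=my_steps_per_epoch * 10,
    decay_rate=0.05)

# evaluate the schedule at a few epoch boundaries
for epoch in (0, 10, 20):
    step = epoch * my_steps_per_epoch
    print(f"epoch {epoch}: lr = {float(lr_schedule(step)):.2e}")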
EDIT: If I pause and save my model (using callbacks) after the 10th epoch, and then resume by loading the model and calling model.fit with initial_epoch=10 and epochs=11, will it start at the 11th epoch and continue applying the exponential decay?
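Concretely, this is the pause-and-resume flow I have in mind (a sketch; the checkpoint name and training data are placeholders):

# save the full model after epoch 10 (optimizer state is included by default)
model.save("model_epoch10.h5")

# ... later, in a new session ...
model = tf.keras.models.load_model("model_epoch10.h5")
model.fit(x_train, y_train, initial_epoch=10, epochs=11)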
decay_steps can be used to state after how many steps (processed batches) the learning rate is decayed. I find it quite useful to specify just the initial and the final learning rate and compute the decay factor automatically, as follows:
import tensorflow as tf

epochs = 100        # example values; substitute your own
train_size = 50000
batch_size = 32
initial_learning_rate = 0.1
final_learning_rate = 0.0001
# per-epoch factor so the LR reaches final_learning_rate after `epochs` epochs
learning_rate_decay_factor = (final_learning_rate / initial_learning_rate)**(1/epochs)
steps_per_epoch = int(train_size / batch_size)

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=initial_learning_rate,
    decay_steps=steps_per_epoch,  # decay once per epoch
    decay_rate=learning_rate_decay_factor,
    staircase=True)
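As a quick check, evaluating the schedule at epoch boundaries (reusing the variables above) shows the learning rate landing on final_learning_rate at the last epoch:

for epoch in range(0, epochs + 1, 10):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch}: lr = {float(lr_schedule(step)):.6f}")

The schedule can then be passed to the optimizer exactly as in the question, e.g. tf.keras.optimizers.SGD(learning_rate=lr_schedule, momentum=0.9).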