I am training my Keras model on a Google Colab TPU as follows:
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, CSVLogger

adam = Adam(lr=0.002)
model.compile(loss='mse', metrics=[PSNRLoss, SSIMLoss], optimizer=adam)

checkpoint = ModelCheckpoint("model_{epoch:02d}.hdf5", monitor='loss', verbose=1,
                             save_best_only=True, mode='min')
reduce_lr = ReduceLROnPlateau(monitor='loss', factor=0.5,
                              patience=5, min_lr=0.00002)
csv_logger = CSVLogger('history.log')
callbacks_list = [checkpoint, reduce_lr, csv_logger]

model.fit(traindb, batch_size=1024, callbacks=callbacks_list, shuffle=True,
          epochs=100, verbose=2, validation_data=validdb)
During training, my learning rate is halved (factor 0.5) even though the loss is still improving at the current learning rate. As you can see in the log snippet below, the learning rate drops from 0.0020 to 0.0010 to 0.0005.
Epoch 00011: loss improved from 0.00647 to 0.00646, saving model to ./models_x4/no_noise/dcscn_x2_11.hdf5
1939/1939 - 109s - PSNRLoss: 23.7280 - loss: 0.0065 - SSIMLoss: 0.3329 - val_PSNRLoss: 23.9022 - val_loss: 0.0066 - val_SSIMLoss: 0.3815 - lr: 0.0020
Epoch 12/100
Epoch 00012: loss improved from 0.00646 to 0.00645, saving model to ./models_x4/no_noise/dcscn_x2_12.hdf5
1939/1939 - 111s - PSNRLoss: 23.7245 - loss: 0.0065 - SSIMLoss: 0.3331 - val_PSNRLoss: 23.9397 - val_loss: 0.0066 - val_SSIMLoss: 0.3705 - lr: 0.0020
Epoch 13/100
Epoch 00013: loss improved from 0.00645 to 0.00644, saving model to ./models_x4/no_noise/dcscn_x2_13.hdf5
1939/1939 - 110s - PSNRLoss: 23.7300 - loss: 0.0064 - SSIMLoss: 0.3321 - val_PSNRLoss: 23.9827 - val_loss: 0.0065 - val_SSIMLoss: 0.3745 - lr: 0.0020
Epoch 14/100
Epoch 00014: loss improved from 0.00644 to 0.00643, saving model to ./models_x4/no_noise/dcscn_x2_14.hdf5
1939/1939 - 111s - PSNRLoss: 23.7279 - loss: 0.0064 - SSIMLoss: 0.3376 - val_PSNRLoss: 23.9079 - val_loss: 0.0066 - val_SSIMLoss: 0.3959 - lr: 0.0020
Epoch 15/100
Epoch 00015: loss improved from 0.00643 to 0.00634, saving model to ./models_x4/no_noise/dcscn_x2_15.hdf5
1939/1939 - 110s - PSNRLoss: 23.8356 - loss: 0.0063 - SSIMLoss: 0.3408 - val_PSNRLoss: 23.7063 - val_loss: 0.0067 - val_SSIMLoss: 0.3799 - lr: 0.0010
Epoch 16/100
Epoch 00016: loss did not improve from 0.00634
1939/1939 - 107s - PSNRLoss: 23.8173 - loss: 0.0063 - SSIMLoss: 0.3398 - val_PSNRLoss: 23.7282 - val_loss: 0.0067 - val_SSIMLoss: 0.3853 - lr: 0.0010
Epoch 17/100
Epoch 00017: loss did not improve from 0.00634
1939/1939 - 110s - PSNRLoss: 23.8199 - loss: 0.0063 - SSIMLoss: 0.3426 - val_PSNRLoss: 23.7202 - val_loss: 0.0067 - val_SSIMLoss: 0.4082 - lr: 0.0010
Epoch 18/100
Epoch 00018: loss did not improve from 0.00634
1939/1939 - 110s - PSNRLoss: 23.8138 - loss: 0.0063 - SSIMLoss: 0.3393 - val_PSNRLoss: 23.7523 - val_loss: 0.0066 - val_SSIMLoss: 0.4037 - lr: 0.0010
Epoch 19/100
Epoch 00019: loss improved from 0.00634 to 0.00634, saving model to ./models_x4/no_noise/dcscn_x2_19.hdf5
1939/1939 - 110s - PSNRLoss: 23.8189 - loss: 0.0063 - SSIMLoss: 0.3406 - val_PSNRLoss: 23.7188 - val_loss: 0.0067 - val_SSIMLoss: 0.4115 - lr: 0.0010
Epoch 20/100
Epoch 00020: loss improved from 0.00634 to 0.00634, saving model to ./models_x4/no_noise/dcscn_x2_20.hdf5
1939/1939 - 108s - PSNRLoss: 23.8176 - loss: 0.0063 - SSIMLoss: 0.3407 - val_PSNRLoss: 23.7692 - val_loss: 0.0066 - val_SSIMLoss: 0.3883 - lr: 0.0010
Epoch 21/100
Epoch 00021: loss improved from 0.00634 to 0.00627, saving model to ./models_x4/no_noise/dcscn_x2_21.hdf5
1939/1939 - 108s - PSNRLoss: 23.8889 - loss: 0.0063 - SSIMLoss: 0.3478 - val_PSNRLoss: 24.0306 - val_loss: 0.0064 - val_SSIMLoss: 0.3544 - lr: 5.0000e-04
Epoch 22/100
Epoch 00022: loss improved from 0.00627 to 0.00627, saving model to ./models_x4/no_noise/dcscn_x2_22.hdf5
1939/1939 - 109s - PSNRLoss: 23.8847 - loss: 0.0063 - SSIMLoss: 0.3466 - val_PSNRLoss: 24.0461 - val_loss: 0.0064 - val_SSIMLoss: 0.3679 - lr: 5.0000e-04
Where am I going wrong? Should I be monitoring some other value? Thanks in advance :)
The ReduceLROnPlateau callback has an argument called min_delta, which is the threshold for counting a new optimum; its default value is 0.0001. So although your log says that the loss improved, the callback ignores any improvement smaller than min_delta. In your log the loss typically improves by only about 0.00001 per epoch (e.g. 0.00647 → 0.00646), which is well below that threshold, so after patience epochs the learning rate is reduced anyway.
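This check can be sketched in plain Python (a simplified version of the callback's 'min'-mode logic; plateau_lr is a hypothetical helper and cooldown handling is omitted):

```python
def plateau_lr(losses, lr=0.002, factor=0.5, patience=5,
               min_delta=1e-4, min_lr=2e-5):
    """Simplified sketch of ReduceLROnPlateau in 'min' mode (cooldown omitted)."""
    best = float('inf')
    wait = 0
    for loss in losses:
        if loss < best - min_delta:   # improvement must exceed min_delta to count
            best = loss
            wait = 0
        else:                         # "no improvement" as far as the callback sees
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
    return lr

# Per-epoch improvements of ~0.00001 are far below the default min_delta=0.0001,
# so the callback treats them as a plateau and halves the learning rate:
plateau_lr([0.00647, 0.00646, 0.00645, 0.00644, 0.00643, 0.00642])  # -> 0.001
# With min_delta=0, every strict improvement counts and lr stays at 0.002:
plateau_lr([0.00647, 0.00646, 0.00645, 0.00644, 0.00643, 0.00642],
           min_delta=0.0)                                           # -> 0.002
```

So the fix is to pass a smaller threshold to ReduceLROnPlateau, e.g. min_delta=1e-5 (or 0 to count any improvement), so that these small per-epoch gains reset the patience counter.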