tensorflow, validation, early-stopping

TensorFlow EarlyStopping stops too early


I have the following EarlyStopping callback, but it stops too soon. I am wondering whether it treats a decrease in val_ndcg_metric as an improvement (which should not be the case, since the bigger the NDCG, the better).

early_stopping = EarlyStopping(monitor='val_ndcg_metric',
                               patience=5,
                               restore_best_weights=True,
                               min_delta=0.001,
                               mode='auto',
                               verbose=2,
                               baseline=None)
    
model.fit(cached_train, 
          epochs=epochs, 
          verbose=True,
          validation_data=cached_validation,
          callbacks=[early_stopping])

Here are the results:

Epoch 1/100
287/287 [==============================] - 68s 220ms/step - ndcg_metric: 0.7687 - root_mean_squared_error: 0.7584 - loss: 19.7870 - regularization_loss: 0.0000e+00 - total_loss: 19.7870 - val_ndcg_metric: 0.8302 - val_root_mean_squared_error: 1.1678 - val_loss: 19.6306 - val_regularization_loss: 0.0000e+00 - val_total_loss: 19.6306
Epoch 2/100
287/287 [==============================] - 62s 215ms/step - ndcg_metric: 0.8403 - root_mean_squared_error: 1.6596 - loss: 19.6016 - regularization_loss: 0.0000e+00 - total_loss: 19.6016 - val_ndcg_metric: 0.8659 - val_root_mean_squared_error: 2.0399 - val_loss: 19.4413 - val_regularization_loss: 0.0000e+00 - val_total_loss: 19.4413
Epoch 3/100
287/287 [==============================] - 62s 216ms/step - ndcg_metric: 0.8679 - root_mean_squared_error: 2.1857 - loss: 19.4620 - regularization_loss: 0.0000e+00 - total_loss: 19.4620 - val_ndcg_metric: 0.8874 - val_root_mean_squared_error: 2.2495 - val_loss: 19.2740 - val_regularization_loss: 0.0000e+00 - val_total_loss: 19.2740
Epoch 4/100
287/287 [==============================] - 62s 215ms/step - ndcg_metric: 0.8861 - root_mean_squared_error: 2.2456 - loss: 19.3463 - regularization_loss: 0.0000e+00 - total_loss: 19.3463 - val_ndcg_metric: 0.8982 - val_root_mean_squared_error: 2.2170 - val_loss: 19.1935 - val_regularization_loss: 0.0000e+00 - val_total_loss: 19.1935
Epoch 5/100
287/287 [==============================] - 62s 215ms/step - ndcg_metric: 0.8945 - root_mean_squared_error: 2.2081 - loss: 19.2647 - regularization_loss: 0.0000e+00 - total_loss: 19.2647 - val_ndcg_metric: 0.9027 - val_root_mean_squared_error: 2.1765 - val_loss: 19.1420 - val_regularization_loss: 0.0000e+00 - val_total_loss: 19.1420
Epoch 6/100
287/287 [==============================] - 62s 216ms/step - ndcg_metric: 0.8987 - root_mean_squared_error: 2.1843 - loss: 19.2139 - regularization_loss: 0.0000e+00 - total_loss: 19.2139 - val_ndcg_metric: 0.9060 - val_root_mean_squared_error: 2.1654 - val_loss: 19.0738 - val_regularization_loss: 0.0000e+00 - val_total_loss: 19.0738
Restoring model weights from the end of the best epoch.
Epoch 00006: early stopping
277/277 [==============================] - 24s 88ms/step - ndcg_metric: 0.8323 - root_mean_squared_error: 1.1680 - loss: 19.6501 - regularization_loss: 0.0000e+00 - total_loss: 19.6501

I would appreciate any thoughts on this.


Solution

  • I do not know what val_ndcg_metric is, but apparently you want it to increase as the model trains. In the callback you set mode='auto', which makes Keras infer the direction from the metric's name; for names it does not recognize (such as val_ndcg_metric), it generally falls back to 'min', so rising NDCG values are counted as "no improvement" and training stops after patience epochs. Try setting mode='max' instead. Training will then halt only if val_ndcg_metric stops increasing for a patience number of epochs.
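
    A minimal sketch of the fix, reusing the callback from the question with only mode changed to 'max' (model, cached_train, cached_validation, and epochs are the objects defined in the question):

    from tensorflow.keras.callbacks import EarlyStopping

    # Treat larger val_ndcg_metric values as improvements (mode='max').
    early_stopping = EarlyStopping(monitor='val_ndcg_metric',
                                   patience=5,
                                   restore_best_weights=True,
                                   min_delta=0.001,
                                   mode='max',
                                   verbose=2,
                                   baseline=None)

    model.fit(cached_train,
              epochs=epochs,
              verbose=True,
              validation_data=cached_validation,
              callbacks=[early_stopping])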