I am doing multi-class classification for a recommender system (item recommendations), and I'm currently training my network using sparse_categorical_crossentropy
loss. Therefore, it is reasonable to perform EarlyStopping
by monitoring my validation loss, val_loss
as follows:
tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10)
which works as expected. However, the performance of the network (recommender system) is measured by Average-Precision-at-10, and is tracked as a metric during training, as average_precision_at_k10
. Because of this, I could also perform early stopping on this metric as follows:
tf.keras.callbacks.EarlyStopping(monitor='average_precision_at_k10', patience=10)
which also works as expected.
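(One caveat worth flagging, and this is my assumption about the defaults rather than something I have verified on every Keras version: with the default mode='auto', EarlyStopping infers the direction of improvement from the metric name, and for a custom name like average_precision_at_k10 it may fall back to minimising. Passing mode='max' makes the intent explicit:
tf.keras.callbacks.EarlyStopping(
    monitor='average_precision_at_k10',
    mode='max',   # the metric should increase, so stop when it stops increasing
    patience=10,
)
)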
My problem: Sometimes the validation loss increases while the Average-Precision-at-10 improves, and vice versa. Because of this, I would need to monitor both and perform early stopping if and only if both are deteriorating. What I would like to do:
tf.keras.callbacks.EarlyStopping(monitor=['val_loss', 'average_precision_at_k10'], patience=10)
which obviously does not work. Any ideas how this could be done?
With guidance from Gerry P above I managed to create my own custom EarlyStopping callback, and thought I would post it here in case anyone else is looking to implement something similar.
If both the validation loss and the mean average precision at 10 do not improve for patience
number of epochs, early stopping is performed.
import numpy as np
from tensorflow import keras

class CustomEarlyStopping(keras.callbacks.Callback):
    def __init__(self, patience=0):
        super(CustomEarlyStopping, self).__init__()
        self.patience = patience
        self.best_weights = None

    def on_train_begin(self, logs=None):
        # Number of epochs waited since both metrics last improved.
        self.wait = 0
        # The epoch the training stops at.
        self.stopped_epoch = 0
        # Initialize the best validation loss to infinity and the best MAP@10 to zero.
        self.best_v_loss = np.inf
        self.best_map10 = 0

    def on_epoch_end(self, epoch, logs=None):
        v_loss = logs.get('val_loss')
        map10 = logs.get('val_average_precision_at_k10')

        # Only reset the patience counter when BOTH the validation loss
        # AND MAP@10 improve; otherwise the epoch counts against 'patience'.
        if np.less(v_loss, self.best_v_loss) and np.greater(map10, self.best_map10):
            self.best_v_loss = v_loss
            self.best_map10 = map10
            self.wait = 0
            # Record the weights of the best epoch so far.
            self.best_weights = self.model.get_weights()
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.stopped_epoch = epoch
                self.model.stop_training = True
                # Guard against the case where no epoch ever improved both metrics.
                if self.best_weights is not None:
                    print("Restoring model weights from the end of the best epoch.")
                    self.model.set_weights(self.best_weights)

    def on_train_end(self, logs=None):
        if self.stopped_epoch > 0:
            print("Epoch %05d: early stopping" % (self.stopped_epoch + 1))
It is then used as:
model.fit(
    x_train,
    y_train,
    batch_size=64,
    steps_per_epoch=5,
    epochs=30,
    verbose=0,
    callbacks=[CustomEarlyStopping(patience=10)],
)
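Note that, as written, the callback only restores the best weights when early stopping actually triggers; if training runs through all epochs without triggering, the final weights are kept. If you want behaviour closer to restore_best_weights=True, one possible tweak (a sketch, not something I have battle-tested) is to restore in on_train_end as well:
    def on_train_end(self, logs=None):
        if self.stopped_epoch > 0:
            print("Epoch %05d: early stopping" % (self.stopped_epoch + 1))
        # Restore the best weights even when training ran to completion.
        if self.best_weights is not None:
            self.model.set_weights(self.best_weights)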