[SOLVED] How to determine the value of early_stopping_patience in HuggingFace's Seq2SeqTrainer EarlyStoppingCallback?

In my Seq2SeqTrainer, I use EarlyStoppingCallback to stop training once the stopping criterion is met.

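from transformers import EarlyStoppingCallback, Seq2SeqTrainer

# model, training_args, the datasets, tokenizer, data_collator and
# compute_metrics are defined earlier in my script.
trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_set,
    eval_dataset=eval_set,
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=1)],
)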

Currently, I am using the default value of 1 for early_stopping_patience, but how can I determine which value I should use? The official documentation doesn't say much:

early_stopping_patience (int) — Use with metric_for_best_model to stop training when the specified metric worsens for early_stopping_patience evaluation calls.

Also, could I set evaluation_strategy to "epoch" instead of "steps" with this EarlyStoppingCallback()?

Thanks in advance.

Solution

Early stopping patience dictates how many consecutive evaluations without improvement you're willing to wait before stopping training: it is a tradeoff between training time and performance (as in getting a good metric).

Setting a patience of 1 is generally not a good idea, as your metric can worsen locally before improving again. As a rule of thumb, you could set a patience of 3 if a lot of time passes between two evaluations, 5 if they are more frequent, and even more if you can afford the compute time.
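To see why a patience of 1 can cut training short, here is a minimal pure-Python sketch of the patience counter. It is a simplified imitation of the callback's behavior, not the library's code: the function name stopping_eval and the loss values are made up for illustration, and it assumes a lower metric is better.

```python
def stopping_eval(losses, patience, threshold=0.0):
    """Return the 1-based evaluation at which training would stop,
    or None if early stopping is never triggered.

    Simplified imitation of a patience counter (hypothetical helper,
    lower metric is better).
    """
    best = None
    bad_evals = 0  # consecutive evaluations without improvement
    for i, loss in enumerate(losses, start=1):
        improved = best is None or (loss < best and abs(loss - best) > threshold)
        if improved:
            best = loss
            bad_evals = 0
        else:
            bad_evals += 1
        if bad_evals >= patience:
            return i
    return None


# Made-up evaluation losses: noisy, but still trending down.
eval_losses = [2.10, 1.80, 1.85, 1.60, 1.62, 1.61, 1.63, 1.40]

for patience in (1, 3, 5):
    print(patience, stopping_eval(eval_losses, patience))
```

With these made-up numbers, a patience of 1 stops at the third evaluation, right before the loss drops again, while a patience of 3 survives that bump and a patience of 5 lets the run finish.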

For your second question, this callback is compatible with evaluation_strategy="epoch" (it is actually the setting I've seen the most in the past). In that case the patience is counted in epochs, since the metric is evaluated once per epoch.
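A sketch of training arguments that work with per-epoch evaluation might look like the following. The callback expects load_best_model_at_end=True and a metric_for_best_model, and the save strategy has to match the evaluation strategy. The output directory, metric name, epoch count and patience below are placeholder assumptions, and newer transformers releases name the first argument eval_strategy instead of evaluation_strategy.

```python
# Placeholder settings for illustration; adjust to your own run.
training_kwargs = dict(
    output_dir="out",                   # hypothetical path
    evaluation_strategy="epoch",        # eval_strategy in newer transformers releases
    save_strategy="epoch",              # must match the evaluation strategy
    load_best_model_at_end=True,        # expected by EarlyStoppingCallback
    metric_for_best_model="eval_loss",  # assumed metric; lower is better
    greater_is_better=False,
    num_train_epochs=20,                # upper bound; early stopping may end sooner
)
early_stopping_kwargs = dict(early_stopping_patience=3)

# Usage, assuming transformers is installed:
# from transformers import EarlyStoppingCallback, Seq2SeqTrainingArguments
# training_args = Seq2SeqTrainingArguments(**training_kwargs)
# callbacks = [EarlyStoppingCallback(**early_stopping_kwargs)]
```

Setting num_train_epochs generously and letting the callback decide when to stop is one common way to use this combination.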