I am training a RoBERTa model for a new language, and it takes a few hours to train on my data. So I think it is a good idea to save the model during training so that I can continue training from where it stopped next time.
I am using the torch library and a Google Colab GPU to train the model.
Here is my Colab notebook: https://colab.research.google.com/drive/1jOYCaLdxYRwGMqMciG6c3yPYZAsZRySZ?usp=sharing
You can use the `Trainer` from `transformers` to train the model. The `Trainer` also needs you to specify the `TrainingArguments`, which allow you to save checkpoints of the model while training.
Some of the parameters you can set when creating the `TrainingArguments` are:

- `save_strategy`: The checkpoint save strategy to adopt during training. Possible values are:
  - `"no"`: No save is done during training.
  - `"epoch"`: Save is done at the end of each epoch.
  - `"steps"`: Save is done every `save_steps`.
- `save_steps`: Number of update steps between two checkpoint saves if `save_strategy="steps"`.
- `save_total_limit`: If a value is passed, limits the total number of checkpoints, deleting the older checkpoints in `output_dir`.
- `load_best_model_at_end`: Whether or not to load the best model found during training at the end of training.
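For example, a minimal runnable sketch could look like this (the tiny config, `ToyDataset`, and the `./roberta-checkpoints` path are stand-ins for your own model, tokenized corpus, and output directory):

```python
import torch
from transformers import (RobertaConfig, RobertaForMaskedLM,
                          Trainer, TrainingArguments)

# Tiny stand-in model and dataset so the sketch runs end to end;
# substitute your own config, tokenizer, and corpus.
model = RobertaForMaskedLM(RobertaConfig(
    vocab_size=1000, hidden_size=64, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=128))

class ToyDataset(torch.utils.data.Dataset):
    def __len__(self):
        return 64
    def __getitem__(self, idx):
        ids = torch.randint(4, 1000, (32,))
        return {"input_ids": ids, "labels": ids.clone()}

training_args = TrainingArguments(
    output_dir="./roberta-checkpoints",  # checkpoints are written here
    save_strategy="steps",               # save a checkpoint every `save_steps` steps
    save_steps=4,
    save_total_limit=2,                  # keep only the two newest checkpoints
    num_train_epochs=1,
    per_device_train_batch_size=8,
)

trainer = Trainer(model=model, args=training_args, train_dataset=ToyDataset())
trainer.train()
```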
One important thing about `load_best_model_at_end` is that when it is set to `True`, `save_strategy` needs to be the same as `eval_strategy`, and in the case it is `"steps"`, `save_steps` must be a round multiple of `eval_steps`.
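Then, in your next session, you can continue from where training stopped by passing `resume_from_checkpoint` to `Trainer.train`, which restores the model weights as well as the optimizer and scheduler state:

```python
# Resume from the latest checkpoint found in `output_dir` ...
trainer.train(resume_from_checkpoint=True)

# ... or from a specific checkpoint directory (example path)
trainer.train(resume_from_checkpoint="./roberta-checkpoints/checkpoint-8")
```

Note that Colab's local disk is cleared when the runtime disconnects, so for this to work across sessions, `output_dir` should point somewhere persistent, such as a mounted Google Drive folder.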