Tags: python-3.x, machine-learning, keras, checkpointing, vgg-net

Stop and Restart Training on VGG-16


I am using a pre-trained VGG-16 model for image classification. I have added a custom last layer because my task has 10 classes. I am training the model for 200 epochs.

My question is: if I stop the training at an arbitrary epoch (by closing the Python window), say epoch 50, is there any way to resume from there? I have read about saving and reloading models, but my understanding is that this works only for custom models, not for pre-trained models like VGG-16.


Solution

  • You can use the ModelCheckpoint callback to save your model at regular intervals. To use it, pass it in the callbacks argument of the fit method:

    from keras.callbacks import ModelCheckpoint
    checkpointer = ModelCheckpoint(filepath='model-{epoch:02d}.hdf5', ...)
    model.fit(..., callbacks=[checkpointer])
    

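    Later, you can load the most recently saved model with keras.models.load_model and continue training by calling fit with initial_epoch set to the number of epochs already completed. Because the checkpoint stores the full model (architecture, weights, and optimizer state, unless save_weights_only is set), this works the same way for a model built on pre-trained VGG-16 as for a custom one. For more options on this callback, see the documentation.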
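
To resume after an interruption, you first need to know which checkpoint is the newest and how many epochs it covers. Here is a minimal sketch, assuming the checkpoints were written with the 'model-{epoch:02d}.hdf5' pattern used above. The helper name find_latest_checkpoint is hypothetical (it is not part of Keras), and the commented usage at the end assumes Keras is installed and that model and checkpointer are defined as in the answer.

```python
import glob
import os
import re


def find_latest_checkpoint(directory="."):
    """Return (path, epoch) for the highest-numbered 'model-NN.hdf5' file.

    Returns (None, 0) when no checkpoint exists, so training starts fresh.
    Hypothetical helper; assumes the 'model-{epoch:02d}.hdf5' naming pattern.
    """
    pattern = re.compile(r"^model-(\d+)\.hdf5$")
    best_path, best_epoch = None, 0
    for path in glob.glob(os.path.join(directory, "model-*.hdf5")):
        match = pattern.match(os.path.basename(path))
        if match is None:
            continue  # skip files such as 'model-final.hdf5'
        epoch = int(match.group(1))
        if epoch > best_epoch:
            best_path, best_epoch = path, epoch
    return best_path, best_epoch


# Usage sketch (left as comments because it needs Keras and your own data):
#
# from keras.models import load_model
# path, epochs_done = find_latest_checkpoint()
# if path is not None:
#     model = load_model(path)  # restores layers, weights, optimizer state
# model.fit(..., epochs=200, initial_epoch=epochs_done,
#           callbacks=[checkpointer])
```

ModelCheckpoint numbers its files from 1, so a file named model-50.hdf5 is written after the 50th epoch finishes, and passing initial_epoch=50 makes fit continue with epoch 51.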