python google-cloud-platform jupyter-notebook gcp-ai-platform-notebook

Automatic reloading of a Jupyter notebook after a crash


Is there a way to automatically reload a Jupyter notebook each time it crashes?

I am running a notebook that trains a deep learning model. After each kernel restart, the notebook can reload the last saved state of the model, along with the state of the optimizer and scheduler, so reloading the notebook after a crash would resume training without a substantial loss of computation.

I was wondering whether there is a simple way to do this using the Jupyter notebook API, or by reacting to a signal from the Jupyter notebook (perhaps in its logs).

Also, I am running the notebook on Google Cloud Platform (on Compute Engine). If you know an efficient way to do this using the GCP troubleshooting services and the logging agent, that would be of interest to me and to others with the same issue.

Thank you again for your time.

I looked for a solution on Stack Overflow, but I didn't find any similar question.


Solution

  • From your comment:

    "reloading the notebook after a crash enables to get back the last state without a substantial loss of computations."

    What do you call a crash? Does it generate logs that can be parsed from /var/log or another location (e.g. journalctl -u jupyter.service)? If so, you can write your own shell script.

    With User Managed Notebooks you have the concept of a post-startup-script or startup-script.

    post-startup-script is the path to a Bash script that automatically runs after a notebook instance fully boots up. The path must be a URL or Cloud Storage path. Example: "gs://path-to-file/file-name"

    This script can run a loop that monitors for the crash you mention.
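
As a rough illustration of such a loop, here is a minimal Python sketch that the Bash post-startup script could launch. It assumes that "crash" means the process running the notebook exits with a non-zero status, and that the notebook resumes from its last checkpoint when re-run, as described in the question. The function name run_until_success and its parameters are hypothetical, not part of any Jupyter or GCP API.

```python
import subprocess
import sys
import time


def run_until_success(cmd, max_restarts=5, delay_seconds=10.0):
    """Run cmd and re-run it after every non-zero exit.

    Returns the number of restarts that were needed. Raises
    RuntimeError if cmd still fails after max_restarts restarts.
    """
    restarts = 0
    while True:
        result = subprocess.run(cmd)  # blocks until the command exits
        if result.returncode == 0:
            return restarts
        if restarts >= max_restarts:
            raise RuntimeError(
                f"{cmd!r} still failing after {restarts} restarts"
            )
        restarts += 1
        print(
            f"exit code {result.returncode}; "
            f"restart {restarts}/{max_restarts}",
            file=sys.stderr,
        )
        # Short pause so a command that fails instantly does not spin.
        time.sleep(delay_seconds)
```

One possible way to use it is to execute the notebook headlessly with nbconvert, for example run_until_success(["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", "/home/jupyter/train.ipynb"]), where the notebook path is only a placeholder. If the crash you see is different (for instance the whole VM or the Jupyter service going down), the check inside the loop would have to be replaced by whatever signal identifies that crash.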