python jupyter-notebook jupyter nbconvert papermill

Can I parse the content of Jupyter Notebook cells in a script?


Is it possible to extract the content of a Jupyter notebook input cell programmatically? Whether it is a raw, code, or Markdown cell does not really matter. I was thinking of tools like nbconvert or papermill but could not find exactly what I am looking for... I would like to write a script that will essentially parse a notebook...

Is it possible to parse the output cells too?


Solution

  • The Jupyter ecosystem includes nbformat for this task.

    The intro at the top of here will probably help you see how nbformat is the tool you seek. Importantly, the abstractions of the notebook, its cells, and the cell types are all baked in, so you don't really have to worry about JSON parsing.

    I have an answer that includes code that does this for the text in code cells programmatically in reply to a post entitled 'How to copy multiple input cells in Jupyter Notebook'.

    I have several additional examples with code among questions/answers linked here and here.
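
As a rough illustration of the idea, here is a minimal sketch that covers both parts of the question: the input (source) of each cell and the text of the outputs. It assumes a version 4 notebook. `nbformat.read` is the real nbformat call for loading a file, while `load_notebook`, `cell_sources`, `cell_output_text`, and `demo_nb` are hypothetical names made up for this example. The helpers only use dictionary-style access, which the objects returned by nbformat also support, so the demo at the bottom runs on a hand-written notebook structure without touching the disk.

```python
def load_notebook(path):
    """Read a .ipynb file into a notebook object (needs `pip install nbformat`)."""
    # Imported here so the helpers below can be tried without nbformat installed.
    import nbformat
    return nbformat.read(path, as_version=4)


def cell_sources(nb, cell_type=None):
    """Return (cell_type, source) pairs, optionally filtered by cell type.

    cell_type can be "code", "markdown", or "raw"; None means all cells.
    """
    return [
        (cell["cell_type"], cell["source"])
        for cell in nb["cells"]
        if cell_type is None or cell["cell_type"] == cell_type
    ]


def cell_output_text(cell):
    """Collect the plain-text parts of a cell's outputs (empty for non-code cells)."""
    texts = []
    for out in cell.get("outputs", []):
        kind = out["output_type"]
        if kind == "stream":  # e.g. text written by print()
            texts.append(out["text"])
        elif kind in ("execute_result", "display_data"):
            plain = out["data"].get("text/plain")
            if plain is not None:
                texts.append(plain)
        elif kind == "error":
            texts.append(f'{out["ename"]}: {out["evalue"]}')
    return texts


# Hand-written stand-in for a loaded notebook, used only for this demo.
demo_nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Title"},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 1,
            "source": "print('hi')",
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": "hi\n"}
            ],
        },
    ],
}

for kind, source in cell_sources(demo_nb):
    print(kind, repr(source))

for cell in demo_nb["cells"]:
    print(cell["cell_type"], cell_output_text(cell))
```

With a real file, the same helpers would be called on the result of `load_notebook("some_notebook.ipynb")` (a placeholder path). Outputs that carry only images or HTML have no `text/plain` entry and are skipped by this sketch.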