python kedro

Is there IO functionality to store trained models in Kedro?


In the IO section of the Kedro API docs I could not find any functionality for storing trained models (e.g. .pkl, .joblib, ONNX, PMML). Have I missed something?


Solution

  • Kedro has a pickle dataset in kedro.io that you can use to save trained models, or any other serialisable object you want to pickle (models being a common case). It accepts a backend argument that defaults to pickle and can be set to joblib if you prefer joblib.

    Note that Kedro is moving its datasets to kedro.extras.datasets and away from keeping non-core datasets in kedro.io. From Kedro 0.16 onwards, look at pickle.PickleDataSet in kedro.extras.datasets, which supports joblib.

    The Kedro spaceflights tutorial in the documentation saves its trained linear regression model with the pickle dataset, if you want to see an example. The relevant section is here.
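
    To make the save/load pattern with a selectable backend concrete, here is a minimal standard-library sketch. It is not Kedro's implementation: SketchPickleDataSet is a hypothetical stand-in, the dict is a placeholder for a trained model, and the file name is made up. For the real class and its arguments, check the Kedro docs for your version.

```python
import importlib
import os
import tempfile


class SketchPickleDataSet:
    """Hypothetical stand-in (not Kedro's class): pickles an object to a file."""

    def __init__(self, filepath, backend="pickle"):
        self._filepath = filepath
        # Both `pickle` and `joblib` expose dump(obj, file) and load(file),
        # so the backend can be picked by module name.
        self._backend = importlib.import_module(backend)

    def save(self, data):
        with open(self._filepath, "wb") as f:
            self._backend.dump(data, f)

    def load(self):
        with open(self._filepath, "rb") as f:
            return self._backend.load(f)


# A plain dict stands in for a trained model in this sketch.
model = {"coef": [0.5, -1.2], "intercept": 3.0}

# Assumed location; a real project would point at its own models folder.
model_path = os.path.join(tempfile.mkdtemp(), "regressor.pkl")

dataset = SketchPickleDataSet(model_path)  # backend="joblib" requires joblib installed
dataset.save(model)
restored = dataset.load()
```

    Any picklable object, such as a fitted scikit-learn estimator, could be passed to save() in place of the dict.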