How to Save/Load Optimized GPy Regression Model


I'm trying to save my optimized Gaussian process model for use in a different script. My current line of thinking is to store the model information in a JSON file, using GPy's built-in to_dict and from_dict functions. Something along the lines of:

import GPy
import numpy as np
import json

X = np.random.uniform(-3.,3.,(20,1))
Y = np.sin(X) + np.random.randn(20,1)*0.05
kernel = GPy.kern.RBF(input_dim=1, variance=1., lengthscale=1.)

m = GPy.models.GPRegression(X, Y, kernel)

m.optimize(messages=True)
m.optimize_restarts(num_restarts = 10)

jt = json.dumps(m.to_dict(save_data=False), indent=4)
with open("j-test.json", 'w') as file:
    file.write(jt)

This step works with no issues, but I run into problems when I try to load the model information using:

with open("j-test.json", 'r') as file:
    d = json.load(file)  # d is a dictionary

m2 = GPy.models.GPClassification.from_dict(d, data=None)

which gives me an assertion error because "data is not None", which it is -- or at least I think so.

I'm really new to GPy and to JSON, so I'm not sure where I've gone astray. I looked through the documentation, but it is a bit vague and I couldn't find an example of these functions in use. Is there a step or concept that I missed? Also, is this the best way to store and reload my model? Any help with this would be greatly appreciated! Thanks!


Solution

  • The pickle module is your friend here! It serializes the whole model object in one call:

    import pickle
    with open('save.pkl', 'wb') as file:
        pickle.dump(m, file)
    

    You can load it back in a later script with:

    with open('save.pkl', 'rb') as file:
        loaded_model = pickle.load(file)
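
To see the round trip end to end without GPy installed, here is a minimal sketch that pickles a stand-in object and checks that what comes back is an equal but separate copy. The `state` dictionary and the `save_model` / `load_model` helpers are hypothetical names made up for this illustration. With a real model you would pass `m` to the same two pickle calls.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a fitted model; with GPy you would pickle `m` itself.
state = {
    "kernel": "RBF",
    "variance": 0.8,
    "lengthscale": 1.7,
    "X": [[-1.0], [0.5], [2.0]],
}


def save_model(obj, path):
    # Pickle writes bytes, so the file must be opened in binary mode.
    with open(path, "wb") as file:
        pickle.dump(obj, file)


def load_model(path):
    # Only unpickle files you trust: loading a pickle can execute arbitrary code.
    with open(path, "rb") as file:
        return pickle.load(file)


# Write to a temporary directory so the sketch leaves nothing in the working directory.
path = os.path.join(tempfile.mkdtemp(), "save.pkl")
save_model(state, path)
restored = load_model(path)

print(restored == state)   # same contents
print(restored is state)   # but a new object
```

Two things to keep in mind when using this with a real model. Pickle stores references to classes by module and name, so the script that loads the file needs GPy importable. A pickle written with one library version is also not guaranteed to load cleanly with another.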