python-2.7bayesianpymcpymc3hierarchical-bayesian

Saving data from traceplot in PyMC3


Below is the code for a simple Bayesian Linear regression. After I obtain the trace and the plots for the parameters, is there any way in which I can save the data that created the plots in a file so that if I need to plot it again I can simply plot it from the data in the file rather than running the whole simulation again?

import pymc3 as pm
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0,9,5)
y = 2*x + 5
yerr=np.random.rand(len(x))

def soln(x, p1, p2):
    return p1+p2*x

with pm.Model() as model:
    # Define priors
    intercept = pm.Normal('Intercept', 15, sd=5)
    slope = pm.Normal('Slope', 20, sd=5)
    # Model solution
    sol = soln(x, intercept, slope)
    # Define likelihood
    likelihood = pm.Normal('Y', mu=sol,
                        sd=yerr, observed=y)

    # Sampling

    trace = pm.sample(1000, nchains = 1)


pm.traceplot(trace)
print pm.summary(trace, ['Slope'])
print pm.summary(trace, ['Intercept'])
plt.show()

Solution

  • There are two easy ways of doing this:

    1. Use a version after 3.4.1 (currently this means installing from master, with pip install git+https://github.com/pymc-devs/pymc3). There is a new feature that allows saving and loading traces efficiently. Note that you need access to the model that created the trace:

      ...
      pm.save_trace(trace, 'linreg.trace') 
      
      # later
      with model:
         trace = pm.load_trace('linreg.trace') 
      
    2. Use cPickle (or pickle in python 3). Note that pickle is at least a little insecure, don't unpickle data from untrusted sources:

      import cPickle as pickle  # just `import pickle` on python 3
      
      ...
      with open('trace.pkl', 'wb') as buff:
          pickle.dump(trace, buff)
      
      #later
      with open('trace.pkl', 'rb') as buff:
          trace = pickle.load(buff)