python-3.x binary-data stan pystan probabilistic-programming

How to store PyStan object as a binary?


I want to store the intermediate results of probabilistic programming with Stan, such as the fit object (see the small working example below), in a file so that I can load them later. Stan compiles models to C++, and I do not want to recompile or rerun a model after each run; I would like to store the results on the filesystem for later analysis.

What is the best way to store Stan objects with PyStan? In other words, how can I store them in binary form, and what is the most practical way to keep the results so that I do not need to run the model again?

Small working example (source here)

import pystan

schools_code = """
data {
    int<lower=0> J; // number of schools
    real y[J]; // estimated treatment effects
    real<lower=0> sigma[J]; // s.e. of effect estimates
}
parameters {
    real mu;
    real<lower=0> tau;
    real eta[J];
}
transformed parameters {
    real theta[J];
    for (j in 1:J)
        theta[j] = mu + tau * eta[j];
}
model {
    eta ~ normal(0, 1);
    y ~ normal(theta, sigma);
}
"""

schools_dat = {'J': 8,
               'y': [28,  8, -3,  7, -1,  1, 18, 12],
               'sigma': [15, 10, 16, 11,  9, 11, 10, 18]}

sm = pystan.StanModel(model_code=schools_code)
fit = sm.sampling(data=schools_dat, iter=1000, chains=4)

Solution

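  • You have a few options, the simplest of which is pickle. With PyStan 2 the compiled model has to be unpickled before the fit, so store both in one file, model first:

    import pickle
    # Keep the model with the fit: the fit cannot be unpickled without it.
    with open('fit.pkl', 'wb') as pickle_out:
        pickle.dump({'model': sm, 'fit': fit}, pickle_out)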
    

    Another option is to export the draws with pandas. This keeps the samples, but what you load back is a plain DataFrame, not a StanFit4Model object.

    import pandas as pd
    fit.to_dataframe().to_csv('fit.csv', encoding='utf-8')
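
For loading the pickled objects back in a later session, a minimal round-trip sketch could look like the following. The helper names `save_objects` and `load_objects` and the file name `stan_demo.pkl` are made up for illustration, and placeholder values stand in for `sm` and `fit` so that the snippet runs without PyStan installed. The point it shows is that a dict is unpickled in insertion order, so the model is restored before the fit.

```python
import os
import pickle
import tempfile


def save_objects(path, **objects):
    """Pickle several named objects into one file, keeping their order."""
    with open(path, "wb") as f:
        pickle.dump(objects, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_objects(path):
    """Return the dict of objects written by save_objects."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical file location, chosen only for this demonstration.
demo_path = os.path.join(tempfile.gettempdir(), "stan_demo.pkl")

# Placeholder values (assumption): with PyStan 2 you would pass
# model=sm, fit=fit here, with the model listed first.
save_objects(demo_path, model="compiled model placeholder", fit=[1.0, 2.0, 3.0])

restored = load_objects(demo_path)
print(list(restored))  # keys come back in the order they were saved
```

With the real objects, `restored['fit']` would be used exactly like the original `fit`. Note that a pickle generally needs to be loaded with the same PyStan and Python versions that wrote it, so the CSV export is the safer choice for long-term archiving.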