python, optimization, confidence-interval, uncertainty, pystan

PyStan Posterior Uncertainty Intervals


I saw on another forum that PyStan doesn’t have an equivalent of RStan’s posterior_interval(), but that numpy.percentile() can be used instead. I’m currently using the pystan.StanModel.optimizing() function in PyStan to get the set of parameters that maximizes the posterior (the MAP estimate). I would now also like to obtain a 95% confidence interval around that result, so I am wondering whether the numpy.percentile() function should be used together with the optimizing function?

I tried finding the 95% interval of the parameter distribution, but that did not give a good confidence interval around the result. In particular, I don't consider it good because, even when I expect the posterior to be multimodal, the interval I compute with numpy.percentile() lies inside a single 2-D Gaussian patch of the posterior.

I think the 95% interval would have to be taken from the posterior itself. Would I use the percentile function together with the optimizing function to obtain a 95% interval on the posterior result?


Solution

  • Obtaining bounds on posterior estimates requires sampling from the posterior, which pystan.StanModel.optimizing does not do; it returns only a point estimate. Instead, use the pystan.StanModel.sampling method to generate MCMC draws from the posterior.

    If you just need readouts of standard confidence bounds, then the pystan.StanFit.stansummary() method might be sufficient, as this will print the 2.5%, 25%, 50%, 75%, and 97.5% quantiles for each parameter. For example,

    fit = sm.sampling(...) # eight schools model
    print(fit.stansummary())
    
    Inference for Stan model: anon_model_19a09b474d1901f191444eaf8a6b8ce2.
    4 chains, each with iter=10000; warmup=5000; thin=1;  post-warmup
    draws per chain=5000, total post-warmup draws=20000.
    
               mean se_mean     sd   2.5%    25%    50%   75%  97.5%   n_eff   Rhat 
    mu         7.98    0.05   5.04   -2.0   4.76   7.91  11.2   18.2   10614    1.0 
    tau        6.54    0.08   5.65   0.24   2.49   5.25   8.98  20.65   4552    1.0 
    eta[0]     0.39  6.7e-3   0.94  -1.53  -0.23   0.42   1.02   2.18  20000    1.0 
    eta[1]   3.3e-4  6.2e-3   0.88  -1.74  -0.58 -2.5e-3   0.57   1.75  20000    1.0 
    eta[2]     -0.2  6.6e-3   0.93  -2.01  -0.84  -0.22   0.41   1.68  20000    1.0 
    eta[3]    -0.03  6.3e-3   0.89   -1.8  -0.61  -0.03   0.56   1.75  20000    1.0 
    eta[4]    -0.35  6.7e-3   0.88  -2.04  -0.94  -0.36   0.22   1.44  17344    1.0 
    eta[5]    -0.22  6.6e-3    0.9  -1.96  -0.81  -0.24   0.35   1.59  18298    1.0 
    eta[6]     0.34  6.8e-3   0.88  -1.43  -0.23   0.36   0.93   2.04  16644    1.0 
    eta[7]     0.05  6.6e-3   0.93  -1.77  -0.58   0.04   0.66   1.88  20000    1.0 
    theta[0]   11.4    0.07   8.23  -1.83   6.04  10.22  15.42  31.52  13699    1.0 
    theta[1]   7.93    0.04   6.21  -4.58   4.09   7.89  11.79  20.48  20000    1.0 
    theta[2]   6.17    0.06   7.72 -11.43   2.06   6.65  10.85  20.53  16367    1.0 
    theta[3]   7.72    0.05   6.53  -5.29   3.68    7.7  11.75  21.04  20000    1.0 
    theta[4]   5.14    0.04   6.35  -9.35   1.44   5.64   9.38  16.49  20000    1.0 
    theta[5]   6.11    0.05   6.66  -8.44   2.22   6.44  10.41  18.52  20000    1.0 
    theta[6]  10.63    0.05   6.76  -1.25   6.04  10.08  14.51  25.66  20000    1.0 
    theta[7]    8.4    0.06   7.85  -7.56   3.89   8.17  12.76   25.3  16584    1.0 
    lp__      -4.89    0.04   2.63 -10.79  -6.47  -4.66  -3.01  -0.43   5355    1.0
    
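    For reference, here is a minimal sketch of how sm and fit above might be built, assuming the standard non-centered eight-schools model and data from the PyStan getting-started guide (the sampler settings are chosen to match the summary shown):

    import pystan

    # Non-centered parameterization of the eight-schools model
    schools_code = """
    data {
      int<lower=0> J;          // number of schools
      real y[J];               // estimated treatment effects
      real<lower=0> sigma[J];  // standard errors of the effects
    }
    parameters {
      real mu;
      real<lower=0> tau;
      real eta[J];
    }
    transformed parameters {
      real theta[J];
      for (j in 1:J)
        theta[j] = mu + tau * eta[j];
    }
    model {
      eta ~ normal(0, 1);
      y ~ normal(theta, sigma);
    }
    """

    schools_data = {'J': 8,
                    'y': [28, 8, -3, 7, -1, 1, 18, 12],
                    'sigma': [15, 10, 16, 11, 9, 11, 10, 18]}

    sm = pystan.StanModel(model_code=schools_code)
    fit = sm.sampling(data=schools_data, iter=10000, warmup=5000, chains=4)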

    Or, if you need specific quantiles, you can apply numpy.percentile to the posterior draws yourself, as you mentioned (see the sketch below).
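
    A minimal sketch, assuming fit is the object returned by sm.sampling above and that a 95% central interval for mu is wanted:

    import numpy as np

    # extract() returns the post-warmup draws as a dict of parameter name -> array
    draws = fit.extract(permuted=True)
    lower, upper = np.percentile(draws['mu'], [2.5, 97.5])
    print('mu 95% posterior interval: [{:.2f}, {:.2f}]'.format(lower, upper))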

    However, as you correctly observe, a single central interval like this is inappropriate for multimodal distributions. That case is addressed in a different answer, but note that if one expects multiple modes a priori, a mixture model is typically used to separate the modes out into distinct, unimodal random variables (an illustrative sketch follows below).
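
    As an illustration only (this is not the model from that answer), a two-component Gaussian mixture for one-dimensional data y might look like the following sketch:

    # Hypothetical two-component Gaussian mixture model
    mixture_code = """
    data {
      int<lower=1> N;
      real y[N];
    }
    parameters {
      real<lower=0, upper=1> lambda;  // mixing proportion
      ordered[2] mu;                  // ordered means help reduce label switching
      real<lower=0> sigma[2];
    }
    model {
      mu ~ normal(0, 10);
      sigma ~ cauchy(0, 5);
      for (n in 1:N)
        target += log_mix(lambda,
                          normal_lpdf(y[n] | mu[1], sigma[1]),
                          normal_lpdf(y[n] | mu[2], sigma[2]));
    }
    """

    # mixture_sm = pystan.StanModel(model_code=mixture_code)
    # mixture_fit = mixture_sm.sampling(data={'N': len(y), 'y': y}, iter=4000, chains=4)

    Each component's mu and sigma then have (ideally) unimodal posteriors that can be summarized with stansummary or numpy.percentile as above.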