I have a linear model of a system for which I don't currently have any data. It could be of the form `Y = a + b + c`. I would like to simulate likely values of `Y` based on assumptions about the distributions of the input parameters. For example, `a` could be distributed as `Normal(mu=-1, sd=0.1)`, `b` as `Normal(mu=15.0, sd=2.0)`, and `c` could be a constant. I have tried to implement such a simulation using PyMC3 as follows:
```python
import arviz as az
import matplotlib.pyplot as plt
import pymc3 as pm

c = 2.0

with pm.Model() as my_model:
    a = pm.Normal("a", -1, 0.1)
    b = pm.Normal("b", 15.0, 2.0)
    Y_mu = a + b + c
    Y_sigma = pm.Normal("Y_sigma", mu=1, sd=1)
    Y = pm.Normal("Y", mu=Y_mu, sigma=Y_sigma)
    prior_checks = pm.sample_prior_predictive(samples=10000, random_seed=123)

# Plot a KDE of each variable returned in the prior predictive dictionary
fig, axes = plt.subplots(ncols=1, nrows=len(prior_checks))
for i, (key, value) in enumerate(prior_checks.items()):
    axes[i].set_title(key)
    az.plot_kde(value, ax=axes[i])
```
I have been unable to find an approach where I can avoid making an assumption about the distribution of `Y`; e.g. I cannot get values for what I define as `Y_mu` above. I would like to not assume a standard deviation for `Y` and just see what values are generated for `Y = a + b + c`.

Am I approaching this the right way? Or am I missing a simple detail and overcomplicating things?
You have to use `pm.Deterministic` to store `Y_mu` as a variable in the trace:
```python
c = 2.0

with pm.Model() as my_model:
    a = pm.Normal("a", -1, 0.1)
    b = pm.Normal("b", 15.0, 2.0)
    # Wrapping the derived quantity in pm.Deterministic records it in the samples,
    # so prior_checks will contain a "Y_mu" entry alongside "a", "b", etc.
    Y_mu = pm.Deterministic("Y_mu", a + b + c)
    Y_sigma = pm.Normal("Y_sigma", mu=1, sd=1)
    Y = pm.Normal("Y", mu=Y_mu, sigma=Y_sigma)
    prior_checks = pm.sample_prior_predictive(samples=10000, random_seed=123)
```
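As a sanity check on what the `"Y_mu"` draws should look like, the same prior simulation can be sketched with plain NumPy Monte Carlo (this is an illustrative cross-check, not the PyMC3 API; it assumes the distributions stated in the question):

```python
import numpy as np

rng = np.random.default_rng(123)
n = 10_000

# Draw from the assumed input distributions (same parameters as the question)
a = rng.normal(-1.0, 0.1, size=n)
b = rng.normal(15.0, 2.0, size=n)
c = 2.0  # constant input

# Y_mu is a sum of independent normals plus a constant, so analytically
# Y_mu ~ Normal(mu = -1 + 15 + 2 = 16, sd = sqrt(0.1**2 + 2.0**2) ≈ 2.0025)
y_mu = a + b + c

print(y_mu.mean())  # close to 16.0
print(y_mu.std())   # close to 2.0
```

The KDE of `prior_checks["Y_mu"]` from the `pm.Deterministic` version above should match this analytic distribution, with no standard deviation assumed for `Y` itself.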