I am trying to build a linear model in Pymc3 that uses age and age*sex interaction term to model some output variable. However, since sex is a [0, 1] categorical variable, the model can't effectively find both cov1_beta and cov2_beta. Any help is appreciated, thank you.
with pm.Model() as model_interaction:
mu = pm.Normal("a", mu=0, sd=1)
cov1_beta = pm.Normal("cov1_age", mu=0, sd=10, shape=1)
cov2_beta = pm.Normal("cov2_age_sex", mu=0, sd=10, shape=2)
mu = mu + cov1_beta*Age_mean_scaled + cov2_beta[Sex_w]*Age_mean_scaled
# Model error
eps = pm.HalfCauchy('eps',20)
# Data likelihood
mdl_lkl = pm.Normal('model', mu=mu, sd=eps, observed=X)
I'm not very familiar with PyMC3, but I can give a you few general pointers: You get the interaction of two variables by multiplying the vectors of the two variables and estimating a beta for the result. So for the interaction of age and sex, you multiply the vectors of age and sex and estimate only one beta for the interaction. You're also redefining mu
in the model, which I would avoid because I don't know how PyMC3 deals with it. So assuming you have two vectors age
and sex
, your model should look like this:
age_sex = age * sex # elementwise product
with pm.Model() as model_interaction:
intercept = pm.Normal("intercept", mu=0, sd=1)
beta_age = pm.Normal("beta_age", mu=0, sd=10, shape=1)
beta_age_sex = pm.Normal("beta_age_sex", mu=0, sd=10, shape=1)
mu = intercept + beta_age * age + beta_age_sex * age_sex
eps = pm.HalfCauchy('eps',20)
mdl_lkl = pm.Normal('model', mu=mu, sd=eps, observed=X)