python pymc3 stan rstan pystan

Can you use sample weights in pystan or pymc3?


If my observed dataset has weights (for example, to track multiplicity), is it possible to pass them to either pystan or pymc3, similar to the weights argument in the function signature of stan_glm (http://mc-stan.org/rstanarm/reference/stan_glm.html) in the rstanarm package?

stan_glm(formula, family = gaussian(), data, weights, subset,
  na.action = NULL, offset = NULL, model = TRUE, x = FALSE, y = TRUE,
  contrasts = NULL, ..., prior = normal(), prior_intercept = normal(),
  prior_aux = exponential(), prior_PD = FALSE, algorithm = c("sampling",
  "optimizing", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE,
  sparse = FALSE)

Solution

  • With Stan (in any of its interfaces, including PyStan), you can introduce weights within the model itself. For example, in a linear regression, instead of y[i] ~ normal(mu[i], sigma) you would write target += weight[i] * normal_lpdf(y[i] | mu[i], sigma).

    This gives you a well-specified density as long as the weights are positive. That said, we tend to prefer generative approaches over weighting the likelihood.
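To make the weighted statement above concrete, here is a minimal sketch, not a definitive implementation. The Stan program is held in a Python string such as one might hand to PyStan; it is not compiled here, and the names N, x, alpha, beta and weight are assumptions chosen for illustration. The plain-Python part re-implements the normal log density by hand and checks that, for integer weights, the weighted sum equals the log-likelihood of a dataset in which each row is repeated weight[i] times, which is the multiplicity reading from the question.

```python
import math

# Hypothetical Stan program for a weighted linear regression.
# Variable names are illustrative assumptions, not taken from the original post.
STAN_MODEL = """
data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
  vector<lower=0>[N] weight;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  for (i in 1:N)
    target += weight[i] * normal_lpdf(y[i] | alpha + beta * x[i], sigma);
}
"""


def normal_lpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


def weighted_loglik(y, mu, sigma, weight):
    """Sum of weight[i] * normal_lpdf(y[i] | mu[i], sigma), mirroring the Stan loop."""
    return sum(w * normal_lpdf(yi, mi, sigma) for yi, mi, w in zip(y, mu, weight))


# Made-up toy values, used only to exercise the functions.
y = [1.2, 0.7, 3.1]
mu = [1.0, 1.0, 2.5]
weight = [2, 1, 3]
sigma = 0.8

weighted = weighted_loglik(y, mu, sigma, weight)

# Expand each observation weight[i] times and use unit weights instead.
y_rep = [yi for yi, w in zip(y, weight) for _ in range(w)]
mu_rep = [mi for mi, w in zip(mu, weight) for _ in range(w)]
replicated = weighted_loglik(y_rep, mu_rep, sigma, [1] * len(y_rep))

print(math.isclose(weighted, replicated))
```

With non-integer weights there is no replicated dataset to compare against, so the weighted target is better read as a modified density than as a likelihood for observed rows.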