I am trying to replicate an ESS (effective sample size) calculation using the method of Vehtari et al. in: Rank-normalization, folding, and localization: An improved Rhat for assessing convergence of MCMC
I am working from the code here: https://github.com/avehtari/rhat_ess/blob/master/code/monitornew.R
# Geyer's initial positive sequence
rho_hat_t <- rep.int(0, n_samples)
t <- 0
rho_hat_even <- 1
rho_hat_t[t + 1] <- rho_hat_even
rho_hat_odd <- 1 - (mean_var - mean(acov[t + 2, ])) / var_plus # 251
rho_hat_t[t + 2] <- rho_hat_odd
while (t < nrow(acov) - 5 && !is.nan(rho_hat_even + rho_hat_odd) &&
(rho_hat_even + rho_hat_odd > 0)) {
t <- t + 2
rho_hat_even = 1 - (mean_var - mean(acov[t + 1, ])) / var_plus # 256
rho_hat_odd = 1 - (mean_var - mean(acov[t + 2, ])) / var_plus # 257
if ((rho_hat_even + rho_hat_odd) >= 0) {
rho_hat_t[t + 1] <- rho_hat_even
rho_hat_t[t + 2] <- rho_hat_odd
}
}
I can follow the code from the paper except when we get to equation 10 in the paper (calculating the cross-chain autocorrelation). The code (lines 251, 256 and 257) appears in the form:
1 - (mean_var - mean(acov[t + 1, ])) / var_plus
which is close to equation 10, except the missing the 's' terms from equation 10:
I can't see anywhere in the code that this is somehow accounted for elsewhere in the way the calculation is being done. I have tried putting the 's' terms back into those lines of code and it makes a big difference to the final ESS value.
Is anyone able to help me understand the discrepancy between paper and code?
Thanks.
In the formula in the paper, s^2
is is the estimate of variance and rho
the estimate of autocorrelation. Thus s^2 * rho
is an estimate of the autocovariance, which is what you see in the code.