Difference in results between glmmTMB and zeroinf (in pscl)

I applied glmmTMB and zeroinfl in pscl to the same dataset. I obtained the identical coefficients for the conditional part, but the coefficients for the binary part are somewhat different. Any idea on the potential factors that makes the difference?

Thanks!

Here's the result from zeroinfl:

# > summary(pscl.res)
# 
# Call:
#   zeroinfl(formula = outside_treatment ~ group + baseline.risk + Age.group + 
#              offset(log(Follow.up)) | group, data = final, dist = "negbin", 
#            link = "logit", trace = TRUE)
# 
# Pearson residuals:
#     Min      1Q  Median      3Q     Max 
# -1.0332 -0.7003 -0.4041  0.1675 12.5728 

# Count model coefficients (negbin with log link):
#                Estimate Std. Error z value Pr(>|z|)    
# (Intercept)    -2.39168    0.13282 -18.007  < 2e-16 ***
# group1         -0.22475    0.09833  -2.286   0.0223 *  
# baseline.risk2 -0.20385    0.12947  -1.575   0.1154    
# baseline.risk3 -0.67367    0.12422  -5.423 5.85e-08 ***
# Age.group1     -0.65774    0.11554  -5.693 1.25e-08 ***
# Log(theta)      0.12366    0.07302   1.693   0.0904 .

# Zero-inflation model coefficients (binomial with logit link):
#             Estimate Std. Error z value Pr(>|z|)
# (Intercept)  -16.564    389.991  -0.042    0.966
# group1        -2.743   1909.909  -0.001    0.999
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
# 
# Theta = 1.1316 
# Number of iterations in BFGS optimization: 35 
# Log-likelihood: -1531 on 8 Df

And here's the result from glmmTBM:

# > summary(true.zinb)
# Family: nbinom2  ( log )
# Formula:          
# outside_treatment ~ group + baseline.risk + Age.group + offset(log(Follow.up))
# Zero inflation:                     ~1 + group
# Data: final
# 
# AIC      BIC   logLik deviance df.resid 
# 3078.1   3110.5  -1531.1   3062.1      414 
# 
# 
# Dispersion parameter for nbinom2 family (): 1.13 
# 
# Conditional model:
#                Estimate Std. Error z value Pr(>|z|)    
# (Intercept)    -2.39168    0.13282 -18.007  < 2e-16 ***
# group1         -0.22475    0.09833  -2.286   0.0223 *  
# baseline.risk2 -0.20385    0.12947  -1.575   0.1154    
# baseline.risk3 -0.67367    0.12422  -5.423 5.85e-08 ***
# Age.group1     -0.65774    0.11554  -5.693 1.25e-08 ***
#   ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Zero-inflation model:
#              Estimate Std. Error z value Pr(>|z|)
# (Intercept)  -20.8282  3289.0961  -0.006    0.995
# group1         0.4139  4626.1083   0.000    1.000

Solution

tl;dr The estimates of zero-inflation probabilities are practically zero. In this limit, these differences are driven by numerical approximation differences (and are very small on the scale of predicted probabilities). You would likely get nearly identical model fits (i.e. very similar log-likelihoods and predicted values) if you refitted the models without zero-inflation.

The zero-inflation component in the reference condition (presumably group0) is extremely small, effectively zero; the true maximum likelihood estimate is presumably -Inf, but -16 or -20 (depending on the package) is where the likelihood surface gets flat enough that the optimizer concludes that it has found an optimum. (plogis(-16) corresponds to a z-i probability of 1e-7, plogis(-20) to 2e-9). The group1 estimates are deviations from the group0 value, but lead to logit-scale estimates of -18 or so. You'll also notice that the standard deviations for these estimates are all gigantic, an indication that the Wald approximation has broken down.

If you run diagnose() on the glmmTMB model it will flag these large effects and give a brief explanation, something along the lines of

Large negative coefficients in zi (log-odds of zero-inflation), dispersion, or random effects (log-standard deviations) suggest unnecessary components (converging to zero on the constrained scale) ...