R partykit: How do I use the offset?

I am trying to predict the frequency of an outcome and I have a lot of data. I have already fitted a glm to the data and now I am trying to use ctree to understand any complex interaction in the dataset that I may have missed.

Instead of directly predicting the residual, I have tried to offset the ctree model to the glm prediction. However, I seem to get the same results when I: (a) use no offset at all, (b) specify the offset in the function, and (c) use the offset in the ctree equation.

I have looked at the documentation (here and here) but did not find it helpful.

I have created some dummy data to mimic what I am doing:

library(partykit)

# Set random number seed
set.seed(15)

# Create Dataset
freq <- rpois(10000, 1.2)
example_df <- data.frame(var_1 = rnorm(10000, 180, 20) * freq / 10,
                        var_2 = runif(10000, 1, 8),
                        var_3 = runif(10000, 1, 2.5) + freq / 1000)
example_df$var_4 = example_df$var_1 * example_df$var_3 + rnorm(10000, 0.1, 0.5)
example_df$var_5 = example_df$var_2 * example_df$var_3 + rnorm(10000, 2, 50)

# Create GLM
base_mod <- glm(freq ~ ., family="poisson", data=example_df)
base_pred <- predict(base_mod)

# Create trees
exc_offset <- ctree(freq ~ ., data = example_df, control = ctree_control(alpha = 0.01, minbucket = 1000))
func_offset <- ctree(freq ~ ., data = example_df, offset = base_pred, control = ctree_control(alpha = 0.01, minbucket = 1000))
equ_offset <- ctree(freq ~ . + offset(base_pred), data = example_df, control = ctree_control(alpha = 0.01, minbucket = 1000))

I expected the trees fitted with an offset to differ from the tree fitted without one. However, the outputs seem to be the same:

# Predict outcomes
summary(predict(exc_offset, example_df))
summary(predict(func_offset, example_df))
summary(predict(equ_offset, example_df))

# Show trees
exc_offset
func_offset
equ_offset

Does anyone know what is going on? How should I use the offsets?

Solution

The ctree() algorithm is not based on a linear predictor and hence including an offset is not possible out-of-the-box. It is possible to include an offset by using a model-based ytrafo score, though. See vignette("ctree", package = "partykit") for more details (also available on CRAN at https://CRAN.R-project.org/web/packages/partykit/vignettes/ctree.pdf).

However, the more natural solution is to use a GLM model-based tree with the glmtree() function. I think this is the tree you are trying to fit:

glmtree(freq ~ ., data = example_df, offset = base_pred, family = poisson,
  alpha = 0.01, minsize = 1000)

See vignette("mob", package = "partykit") for more details (also available on CRAN at https://CRAN.R-project.org/web/packages/partykit/vignettes/mob.pdf).
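To see that the offset does change a glmtree() fit, here is a minimal sketch on a small simulated dataset. The data, the variable names (x1, x2, y) and the tuning values are assumptions made for illustration and are not taken from the question. The formula y ~ 1 | x1 + x2 requests an intercept-only Poisson model in each node, with x1 and x2 as candidate splitting variables.

```r
library(partykit)

set.seed(1)
n <- 2000
d <- data.frame(x1 = runif(n), x2 = runif(n))
# Hypothetical truth: log-rate is linear in x1, with a jump where x2 > 0.5
d$y <- rpois(n, exp(0.2 + 0.8 * d$x1 + 0.7 * (d$x2 > 0.5)))

# Global GLM that misses the jump; predict() defaults to the link (log) scale,
# which is the scale an offset in a Poisson model has to be on
base_mod <- glm(y ~ x1, family = poisson, data = d)
d$base_pred <- predict(base_mod)

# Intercept-only Poisson model per node, with the GLM prediction as offset
tree_offset <- glmtree(y ~ 1 | x1 + x2, data = d, offset = base_pred,
                       family = poisson, alpha = 0.01, minsize = 200)

# Same specification without the offset, for comparison
tree_plain <- glmtree(y ~ 1 | x1 + x2, data = d,
                      family = poisson, alpha = 0.01, minsize = 200)

print(tree_offset)
print(tree_plain)
```

Comparing the two printed trees shows whether the splits and node intercepts change once the GLM prediction is accounted for. With the offset, the tree only has to explain what the global model missed.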

But rather than estimating the offset once and then the tree once, it is also easy to iterate this process to obtain a better fit. We call this a PALM tree (partially additive linear model tree), available in the palmtree package (https://doi.org/10.1007/s11634-018-0342-1).

Finally, I would encourage you to explore which of the available covariates should be used as:

regressors in the offset (global regressors)
regressors in each node (local regressors)
splitting variables

The resulting model might be more interpretable when each covariate is assigned to the right part.
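As a rough illustration of the three roles, the sketch below uses one covariate in each. The simulated data and the names g (global regressor), l (local regressor), s (splitting variable) and off are hypothetical, chosen only to make the roles visible. In the glmtree() formula, regressors to the left of | are estimated separately in each node and variables to the right of | are used for splitting.

```r
library(partykit)

set.seed(2)
n <- 2000
d <- data.frame(g = runif(n), l = runif(n), s = runif(n))
# Hypothetical truth: a global effect of g, and an effect of l whose sign
# depends on whether s exceeds 0.5
d$y <- rpois(n, exp(0.1 + 0.6 * d$g + ifelse(d$s > 0.5, 1, -1) * 0.8 * d$l))

# Global regressor: estimated once and passed on through the offset (log scale)
d$off <- predict(glm(y ~ g, family = poisson, data = d))

# Local regressor l (left of |), splitting variable s (right of |)
tree <- glmtree(y ~ l | s, data = d, offset = off, family = poisson,
                alpha = 0.01, minsize = 200)

# Node-specific intercepts and slopes for l
print(coef(tree))
```

Moving a covariate from one role to another only requires changing the GLM formula for the offset or the two sides of | in the tree formula, so different assignments are cheap to compare.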