Briefly, I am working with data sets from two different countries. My aim is to ensemble the models for both countries to see how well the ensemble generalizes.
My set-up is: I have trained one workflow_set for each country (10 model specifications with resampling and a grid search of size 20).
This is the error I get when trying to add them as candidates:
predictions <- stacks() %>%
  add_candidates(wf_set_1) %>%
  add_candidates(wf_set_2)
Error: It seems like the new candidate member 'Logistic Regression' doesn't make use of the same resampling object as the existing candidates.
Thanks for the question!
Unfortunately, we don't support ensembling models trained on different data sets in stacks. There are a few operations that are no longer well-defined when this is the case.
Given your description of the problem, though, this sounds like a setting where, rather than fitting a model for each country, you would include the country as a feature in one model fit across both countries. For any covariate x_i whose effect you think may depend on country, you can create an interaction term with step_interact(terms = ~ x_i:country). If country is a factor, convert it to dummy variables with step_dummy() first and interact x_i with the resulting columns.
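As a rough illustration of the pooled approach, here is a minimal sketch of a recipe with a country interaction. The data frame `pooled` and its columns `y`, `x_i`, and `country` are made up for the example and stand in for the two country data sets stacked row-wise; the native pipe `|>` (R 4.1 or later) is used only to keep the example free of extra dependencies.

```r
library(recipes)

# Hypothetical pooled data: rows from both countries, labelled by `country`.
set.seed(1)
pooled <- data.frame(
  y       = factor(sample(c("no", "yes"), 40, replace = TRUE)),
  x_i     = rnorm(40),
  country = factor(rep(c("A", "B"), each = 20))
)

rec <- recipe(y ~ x_i + country, data = pooled) |>
  # Dummy-code country first; step_interact() is intended for numeric columns.
  step_dummy(country) |>
  # Interact x_i with every dummy column created from country.
  step_interact(terms = ~ x_i:starts_with("country_"))

# Inspect the processed training data, including the interaction column.
baked <- bake(prep(rec), new_data = NULL)
names(baked)
```

The same recipe can then be passed to each workflow in a single workflow_set, so that all candidates share one resampling object and can be added to the stack together.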