I'm struggling to translate code from Stata to R. I want to do multilevel modeling with a complex survey design in R, and I've done a lot of digging, but I can't seem to find the right solution (or maybe I'm just not understanding the code).
Here's the Stata code:
svyset id_1, weight(wt_1) strata(strat_id)|| _n, weight(wt_2)
svy, subpop(subpopulation): melogit dep1 independent_vars || id_1: independent_vars2, or
In R, I've found the BIFIEsurvey package, but I'm not sure whether my code is equivalent to the Stata code. There also seems to be no subpopulation option in R, so I'm wondering whether there are alternative solutions.
model <- BIFIEsurvey::BIFIE.twolevelreg(BIFIEobj=data, dep = "dep1", formula.fixed=~ independent_vars, formula.random = ~ independent_vars2, idcluster = "strat_id", wgtlevel1 = "wt_1", wgtlevel2="wt_2", se = FALSE)
Thank you for any suggestions.
The R survey package handles subpopulations automatically if you subset the survey design object, so there doesn't tend to be as much discussion of subpopulations in R. However, they would be important here.
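As a rough illustration of subsetting a design object, here is a minimal sketch using the api example data that ships with the survey package (not your data). The variable names dnum, pw, fpc, stype and api00 come from that dataset and are stand-ins only; with your data the design would be built from id_1, strat_id and your weights instead.

```r
library(survey)

# Example data shipped with the survey package (not the asker's data)
data(api)

# One-stage cluster sample: school districts (dnum) are the clusters
design <- svydesign(ids = ~dnum, weights = ~pw, fpc = ~fpc, data = apiclus1)

# Subset the design object, not the data frame, so that the standard
# errors still reflect the full design (the analogue of Stata's subpop())
elementary <- subset(design, stype == "E")

# Domain estimate for elementary schools only
est <- svymean(~api00, elementary)
print(est)
```

The point of the sketch is the order of operations: build the design on the full sample first, then call subset() on the design.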
BIFIEsurvey::BIFIE.twolevelreg only does linear mixed models, so it's not the same as svy: melogit. The same is true of WeMix::mix, and of my svylme::svy2lme (which fits a wider range of linear mixed models). As far as I know, there's no package to do mixed effects logistic regression on survey data in R.
If you only need the regression coefficients (not the variance components) and you want valid standard errors, just use survey::svyglm.
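A minimal sketch of that approach, again on the survey package's api example data rather than your data (dnum, pw, fpc, sch.wide, ell and meals are that dataset's variables, used here purely as stand-ins). It fits a design-based logistic regression and reports odds ratios, roughly what the or option gives in Stata, but without any random effects.

```r
library(survey)

# Example data shipped with the survey package (not the asker's data)
data(api)
design <- svydesign(ids = ~dnum, weights = ~pw, fpc = ~fpc, data = apiclus1)

# quasibinomial() gives the same point estimates as binomial() but avoids
# warnings about non-integer counts caused by the sampling weights
fit <- svyglm(I(sch.wide == "Yes") ~ ell + meals,
              design = design, family = quasibinomial())

# Odds ratios with design-based (Wald) confidence intervals
or_table <- exp(cbind(OR = coef(fit), confint(fit)))
print(or_table)
```

A subsetted design object can be passed to svyglm in the same way if the model should be restricted to a subpopulation.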
Alternatively, if you have access to the variables used to select people for the survey and they are suitable explanatory variables for your model, just use lme4::glmer or the brms package or something and ignore the design.
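For the lme4 route, a sketch on simulated data might look like the following. Everything here is hypothetical: cluster stands in for id_1, stratum stands in for a design variable that is also a sensible covariate, and x and y are made-up predictor and outcome names. No weights are used, which is only reasonable under the condition described above.

```r
library(lme4)

set.seed(1)
n_clusters <- 40
n_per <- 25

# Hypothetical cluster-level data: `stratum` plays the role of a design
# variable, `u` is the true random intercept used to simulate outcomes
cluster_info <- data.frame(
  cluster = factor(seq_len(n_clusters)),
  stratum = factor(rep(c("urban", "rural"), length.out = n_clusters)),
  u = rnorm(n_clusters, sd = 0.7)
)

# Expand to one row per person and simulate a binary outcome
dat <- cluster_info[rep(seq_len(n_clusters), each = n_per), ]
dat$x <- rnorm(nrow(dat))
eta <- -0.5 + 0.8 * dat$x + 0.4 * (dat$stratum == "urban") + dat$u
dat$y <- rbinom(nrow(dat), size = 1, prob = plogis(eta))

# Unweighted mixed-effects logistic regression with the design variable
# included as a fixed effect and a random intercept per cluster
fit <- glmer(y ~ x + stratum + (1 | cluster), data = dat, family = binomial)

print(exp(fixef(fit)))  # fixed effects as odds ratios
print(VarCorr(fit))     # estimated random-intercept variance
```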
If you have follow-up questions, they'd probably be more appropriate on CrossValidated rather than StackOverflow.