I have a survey data, with categorical variables. I would like to compute the proportion for each responses in Q1, as well as generating the confidence interval or uncertainty around the estimate. Is that doable? The dataset already has sample weight generated. I am using R survey package but can't seem to get my head around it. An example table below Thank you
ID | Personweight | Q1 | Q2 |
---|---|---|---|
11 | 1.3 | 1 | 4 |
13 | 1.5 | 3 | 3 |
16 | 1.3 | 4 | 2 |
18 | 1.3 | 2 | 3 |
19 | 1.6 | 1 | 1 |
survey_design <- svydesign(ids = ~1, data = xxx, weights = ~Personweight)
proportions <- svyciprop(~Q1, survey_design)
#error seems to suggest that this function doesn't work for non-binary variable.
If you're happy with the ordinary +/- 1.96 standard error confidence intervals, you can just use svymean
and confint
.
Using one of the built-in examples:
> svymean(~stype, dclus2)
mean SE
stypeE 0.68118 0.0556
stypeH 0.13432 0.0567
stypeM 0.18450 0.0222
> confint(svymean(~stype, dclus2))
2.5 % 97.5 %
stypeE 0.57212715 0.7902345
stypeH 0.02312234 0.2455123
stypeM 0.14094122 0.2280625
If you want one of the other methods, you need to do some data wrangling. You could use svyciprop
on binary indicators
> svyciprop(~I(stype=="E"),dclus2,method="xlogit")
2.5% 97.5%
I(stype == "E") 0.681 0.560 0.782
or you could transform using svycontrast
, use confint
, and then transform back, which is what actually happens in svyciprop
> m<-svymean(~stype, dclus2)
> l<-svycontrast(m, quote(log(stypeE/(1-stypeE))))
> c<-confint(l,df=degf(dclus2))
> exp(c)/(1+exp(c))
2.5 % 97.5 %
contrast 0.5599558 0.782011
The only issue now is how to write the loop that does this for all levels. Perhaps something like
> f<-function(level) eval(bquote(
svyciprop(~I(stype==.(level)),dclus2,method="xlogit")
))
> lapply(c("E","M","H"),f)
[[1]]
2.5% 97.5%
I(stype == "E") 0.681 0.560 0.782
[[2]]
2.5% 97.5%
I(stype == "M") 0.185 0.144 0.234
[[3]]
2.5% 97.5%
I(stype == "H") 0.1343 0.0547 0.2939