In R, I have a dataframe (mydf) that looks like:
weight | gender | var1 | var2 |
---|---|---|---|
100 | M | 1 | 3 |
800 | F | 2 | 8 |
10 | F | 2 | 9 |
150 | F | 4 | 10 |
(But with 100 columns (var3, var4, etc.) and 2000 rows.)
I want to calculate the weighted frequency and descriptive statistics for each variable (the "var" columns), grouped by gender. On the ungrouped data, I used the summarytools package to calculate the frequency and descriptive statistics (the freq and descr functions), and that worked fine. My code was:
## generate descriptive stats and specify weight
mydf_descr <- descr(mydf, weights = mydf$weight)
## generate frequency tables and specify weight
mydf_freq <- freq(mydf, weights = mydf$weight)
However, when I try to apply grouping I'm getting errors. My code is:
mydf_descr_gender <- mydf %>%
group_by(gender) %>%
descr(., weights = mydf$weight)
However, I got the error:
Error in descr(x = as_tibble(var_obj)[gr_inds[[g]], ], stats = stats, :
weights vector must have same length as 'x'
And I'm getting the same thing for the freq function.
I also tried:
mdf_freq_gen <- mydf %>%
group_by(gender) %>%
summarise_all(~ freq(., weights = weight))
And got the error:
Error in `summarise()`:
ℹ In argument: `var1 = (structure(function (..., .x = ..1, .y = ..2, . = ..1) ...`.
ℹ In group 1: `gender = 1`.
Caused by error in `freq()`:
! weights vector must have same length as 'x'
Run `rlang::last_trace()` to see where the error occurred.
I've tried a bunch of stuff but I can't seem to get it to run the function when grouped and include the weights (it works fine without the weighting). I'm sure I'm missing something obvious!
Any help/ideas would be much appreciated!
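The error occurs because mydf$weight is the full-length weight vector (one value per row of mydf), while descr only receives one group's rows at a time, so the lengths no longer match. You can split the dataset and apply the function to each subset, taking the weights from that same subset.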
library(dplyr)
library(summarytools)
mydf %>%
split(.$gender) %>%
purrr::map(~descr(.x, weights = .x$weight))
The same can also be achieved with group_map:
mydf %>%
group_by(gender) %>%
group_map(~descr(.x, weights = .x$weight))
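To see what taking the weights from each subset does, here is a minimal base-R sketch on the four example rows from the question. It is not summarytools output: weighted.mean and xtabs are used only as simple stand-ins for a weighted descriptive statistic and a weighted frequency table, and the names by_gender, wmeans and wfreqs are made up for illustration.

```r
# Toy data copied from the example table in the question
mydf <- data.frame(
  weight = c(100, 800, 10, 150),
  gender = c("M", "F", "F", "F"),
  var1   = c(1, 2, 2, 4),
  var2   = c(3, 8, 9, 10)
)

# One data frame per gender; each piece carries its own weight column
by_gender <- split(mydf, mydf$gender)

# The full weight vector is longer than any single group, which is the
# mismatch behind "weights vector must have same length as 'x'"
length(mydf$weight)   # 4
nrow(by_gender$F)     # 3

# Weighted mean of var1 per group, using that group's weights only
wmeans <- sapply(by_gender, function(d) weighted.mean(d$var1, d$weight))

# Weighted frequency of var1 per group (sum of weights for each value)
wfreqs <- lapply(by_gender, function(d) xtabs(weight ~ var1, data = d))

print(wmeans)
print(wfreqs)
```

The key point is the same as in the split and group_map versions above: inside the function, both the data and the weights come from the same subset d, so their lengths always agree.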