I am running a function for multiple weighted t-tests on different subsets of a dataframe. My function is essentially the following:
library(weights)
group_list <- list(unique(df$group))
t_tests <- for (g in group_list) {
  wtd.t.test(x = df[df$group == g, ]$var2[df[df$group == g, ]$var1 == "A"],
             y = df[df$group == g, ]$var2[df[df$group == g, ]$var1 == "B"],
             weight = df[df$group == g, ]$weight[df[df$group == g, ]$var1 == "A"],
             weighty = df[df$group == g, ]$weight[df[df$group == g, ]$var1 == "B"],
             samedata = FALSE)
}
Here var2 is the outcome variable of interest. I want to test whether the mean of var2 differs between var1 == "A" and var1 == "B", and to run this test separately for each value of the variable group.
Running the code above produces the following error:
Error in wtd.t.test(x = df[df$group == g, : object 'out' not found
Have I structured the function improperly? How can I run this weighted t-test for every subset of the dataframe?
UPDATE: New approach using nested tibbles as suggested
My new approach is the following:
library(weights)
library(tidyverse)
df %>%
  nest(-group) %>%
  mutate(fit = map(data, ~ wtd.t.test(x = . %>% filter(var1 == "A")$var2,
                                      y = . %>% filter(var1 == "B")$var2,
                                      weight = . %>% filter(var1 == "A")$weight,
                                      weighty = . %>% filter(var1 == "B")$weight,
                                      samedata = FALSE)),
         results = map(fit, glance)) %>%
  unnest(results)
The new error message is:
Error in `mutate()`:
ℹ In argument: `fit = map(...)`.
Caused by error in `map()`:
ℹ In index: 1.
Caused by error in `weight / mean(weight, na.rm = TRUE)`:
! non-numeric argument to binary operator
Backtrace:
1. ... %>% unnest(results)
10. purrr::map(...)
11. purrr:::map_("list", .x, .f, ..., .progress = .progress)
15. .f(.x[[i]], ...)
16. weights::wtd.t.test(...)
All of my variables are numeric except var1, which is not used in the calculations, so it is unclear to me why this error occurs. Any suggestions would be much appreciated.
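A likely cause of the non-numeric error can be sketched with magrittr alone. When a pipeline starts with a bare `.`, magrittr does not evaluate it; it builds a function (a "functional sequence") instead. Because `$` binds more tightly than `%>%`, the trailing `$var2` does not change that, so `wtd.t.test` would receive functions rather than numeric vectors. The sketch below uses mtcars and `subset()` as stand-ins for the real data and `filter()`; those substitutions are assumptions for illustration only.

```r
library(magrittr)

# A pipeline that starts with a bare dot is not evaluated; magrittr
# turns it into a reusable function (a "functional sequence").
f <- . %>% subset(gear == 3)
print(class(f))
print(is.function(f))

# `$` binds more tightly than `%>%`, so this is still a function,
# not the disp column, even though the expression ends in $disp.
g <- . %>% subset(gear == 3)$disp
print(is.function(g))

# Passing the data as an explicit first argument returns the
# numeric column that a t-test expects.
x <- subset(mtcars, gear == 3)$disp
print(is.numeric(x))
```

Passing `.` as an explicit first argument, as in `filter(., var1 == "A")$var2`, avoids building a function in the first place.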
If I instead pass the data to filter() explicitly, as follows:
df %>%
  nest(-group) %>%
  mutate(fit = map(data, ~ wtd.t.test(x = filter(., var1 == "A")$var2,
                                      y = filter(., var1 == "B")$var2,
                                      weight = filter(., var1 == "A")$weight,
                                      weighty = filter(., var1 == "B")$weight,
                                      samedata = FALSE)),
         results = map(fit, glance)) %>%
  unnest(results)
Now the error becomes:
Error in `mutate()`:
ℹ In argument: `fit = map(...)`.
Caused by error in `map()`:
ℹ In index: 1.
Caused by error in `wtd.t.test()`:
! object 'out' not found
Backtrace:
1. ... %>% unnest(results)
10. purrr::map(...)
11. purrr:::map_("list", .x, .f, ..., .progress = .progress)
15. .f(.x[[i]], ...)
16. weights::wtd.t.test(...)
UPDATE 2
Here is the new code updated with a reproducible example:
library(weights)
library(tidyverse)
mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~ wtd.t.test(x = . %>% filter(gear == 3)$disp,
                                      y = . %>% filter(gear == 4)$disp,
                                      weight = . %>% filter(gear == 3)$wt,
                                      weighty = . %>% filter(gear == 4)$wt,
                                      samedata = FALSE)),
         results = map(fit, glance)) %>%
  unnest(results)
and reformatted:
mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~ wtd.t.test(x = filter(., gear == 3)$disp,
                                      y = filter(., gear == 4)$disp,
                                      weight = filter(., gear == 3)$wt,
                                      weighty = filter(., gear == 4)$wt,
                                      samedata = FALSE)),
         results = map(fit, glance)) %>%
  unnest(results)
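The object 'out' not found error is consistent with one of the two samples being empty in some subset. As far as I can tell, `wtd.t.test` only builds its result when `y` has length 1 or length greater than 1, so a zero-length `y` leaves nothing to return. That reading of the function is an assumption on my part, not something stated in its documentation. A base R cross-tabulation shows which levels of cyl contain both gear groups:

```r
# Count rows for every cyl x gear combination in mtcars.
counts <- table(cyl = mtcars$cyl, gear = mtcars$gear)
print(counts)

# Levels of cyl that have at least one gear 3 row and one gear 4 row.
testable <- rownames(counts)[counts[, "3"] > 0 & counts[, "4"] > 0]
print(testable)
```

In mtcars there are no rows with cyl == 8 and gear == 4, so that subset cannot be tested and has to be skipped or handled separately.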
For those interested, a solution (using the mtcars dataset as example data) is the following:
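library(weights)
# Split mtcars into one data frame per level of cyl.
df_list <- split(mtcars, mtcars$cyl)
# Weighted t-test of disp for gear 3 vs. gear 4 within one subset.
multiple_wt_ttest <- function(df) {
  a <- subset(df, gear == 3)
  b <- subset(df, gear == 4)
  # Skip subsets that lack either group (cyl == 8 has no gear == 4 rows).
  if (nrow(a) == 0 || nrow(b) == 0) return(NULL)
  wtd.t.test(x = a$disp, y = b$disp, weight = a$wt, weighty = b$wt,
             samedata = FALSE)$coefficients
}
# rbind() ignores the NULL entries, leaving one row per tested subset.
data_store <- do.call(rbind, lapply(df_list, multiple_wt_ttest))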
This yields a table of t-test results (t value, degrees of freedom, and p-value), with one row for each level of cyl for which the test can be computed.