I'm trying to create a unified interface for running a t-test using {rlang}, and I'd like to use the formula method for t.test()
. While doing so, I can construct a formula using {rlang} for unpaired t-test, but can't figure out how to create the formula for paired test.
library(rlang)
two_sample_t_test <- function(data, x, y, paired, ...) {
x <- ensym(x)
y <- ensym(y)
exec(
t.test,
formula = if (paired) {
new_formula(quote(Pair(x, y)), 1)
} else {
new_formula(y, x)
},
data = data,
...
)
}
# unpaired t-test -------------------------
two_sample_t_test(mtcars, am, wt, paired = FALSE, var.equal = TRUE)
#>
#> Two Sample t-test
#>
#> data: wt by am
#> t = 5.2576, df = 30, p-value = 1.125e-05
#> alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
#> 95 percent confidence interval:
#> 0.8304317 1.8853577
#> sample estimates:
#> mean in group 0 mean in group 1
#> 3.768895 2.411000
# paired t-test -------------------------
two_sample_t_test(mtcars, am, wt, paired = TRUE)
#> Error in model.frame.default(formula = .Primitive("quote")(Pair(x, y) ~ : invalid type (list) for variable 'Pair(x, y)'
Created on 2023-09-11 with reprex v2.0.2
In {rlang} we can use expr()
to capture the expression inside new_formula
and !!
to evaluate x
and y
early, which should give us the desired result (the output is much nicer than my old answer using eval(bquote(...))
, see below):
library(rlang)
two_sample_t_test <- function(data, x, y, paired, ...) {
x <- ensym(x)
y <- ensym(y)
exec(
t.test,
formula = if (paired) {
new_formula(expr(Pair(!! x, !! y)), 1)
} else {
new_formula(y, x)
},
data = data,
...
)
}
two_sample_t_test(mtcars, am, wt, paired = TRUE)
#>
#> Paired t-test
#>
#> data: Pair(am, wt)
#> t = -11.589, df = 31, p-value = 8.457e-13
#> alternative hypothesis: true mean difference is not equal to 0
#> 95 percent confidence interval:
#> -3.305685 -2.316315
#> sample estimates:
#> mean difference
#> -2.811
two_sample_t_test(mtcars, am, wt, paired = FALSE, var.equal = TRUE)
#>
#> Two Sample t-test
#>
#> data: wt by am
#> t = 5.2576, df = 30, p-value = 1.125e-05
#> alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
#> 95 percent confidence interval:
#> 0.8304317 1.8853577
#> sample estimates:
#> mean in group 0 mean in group 1
#> 3.768895 2.411000
I'd try to use eval(bquote(...))
instead of quote()
alone, then evaluate x
and y
early by wrapping them in .()
and use the data.frame
supplied in the data
arg as envir
for eval
.
library(rlang)
two_sample_t_test <- function(data, x, y, paired, ...) {
x <- ensym(x)
y <- ensym(y)
exec(
t.test,
formula = if (paired) {
new_formula(eval(bquote(Pair(.(x), .(y))), envir = data), 1)
} else {
new_formula(y, x)
},
data = data,
...
)
}
two_sample_t_test(mtcars, am, wt, paired = FALSE, var.equal = TRUE)
#>
#> Two Sample t-test
#>
#> data: wt by am
#> t = 5.2576, df = 30, p-value = 1.125e-05
#> alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
#> 95 percent confidence interval:
#> 0.8304317 1.8853577
#> sample estimates:
#> mean in group 0 mean in group 1
#> 3.768895 2.411000
two_sample_t_test(mtcars, am, wt, paired = TRUE)
#>
#> Paired t-test
#>
#> data: structure(c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2.62, 2.875, 2.32, 3.215, 3.44, 3.46, 3.57, 3.19, 3.15, 3.44, 3.44, 4.07, 3.73, 3.78, 5.25, 5.424, 5.345, 2.2, 1.615, 1.835, 2.465, 3.52, 3.435, 3.84, 3.845, 1.935, 2.14, 1.513, 3.17, 2.77, 3.57, 2.78), dim = c(32L, 2L), dimnames = list(NULL, c("x", "y")), class = "Pair")
#> t = -11.589, df = 31, p-value = 8.457e-13
#> alternative hypothesis: true mean difference is not equal to 0
#> 95 percent confidence interval:
#> -3.305685 -2.316315
#> sample estimates:
#> mean difference
#> -2.811
Created on 2023-09-11 with reprex v2.0.2