I have a data frame consisting of 3 groups, each of which has multiple pre and post measurements. I would like to run paired t-tests comparing each pre and post measurement within each group.
Group  vo2_pre  vo2_post  vt_pre  vt_post
0      20       20        21      23
1      30       19        10      11
2      10       30        53      34
1      22       25        32      20
2      34       32        40      30
0      30       40        50      40
0      39       19        40      20
1      40       20        20      20
2      50       20        10      10
0      34       30        23      10
I can use the code below to get the p-values for pre vs post within each group (0, 1, 2) for the vo2 variable. However, I would have to repeat it for every other variable to get its pre and post comparison, e.g., vt_pre vs vt_post. I have a total of 28 variables (14 baseline measures and 14 follow-up measures).
library(dplyr)
library(broom)

DLR_vo2tt <- DLR_df %>%
  group_by(Group) %>%
  do(tidy(t.test(.$vo2_pre,
                 .$vo2_post,
                 mu = 0,
                 alt = "two.sided",
                 paired = TRUE,
                 conf.level = 0.99)))
My question: is there a better way to do this so that I don't have to repeat the above code for each of the 14 pre and post pairs of variables?
The trick is to reshape your data to a longer format and then run the test. For example,
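library(dplyr)
library(tidyr)
library(broom)

DLR_df %>%
  # split names such as "vo2_pre" into test = "vo2" and stage = "pre"
  pivot_longer(-Group, names_to = c("test", "stage"), names_sep = "_") %>%
  # put "pre" first so the estimate is pre minus post
  mutate(stage = factor(stage, levels = c("pre", "post"))) %>%
  group_by(Group, test) %>%
  # one paired t-test per Group/test combination
  summarize(tidy(t.test(value ~ stage, data = cur_data(), paired = TRUE, conf.level = 0.99)))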
will return
  Group test  estimate statistic p.value parameter conf.low conf.high method        alternative
  <int> <chr>    <dbl>     <dbl>   <dbl>     <dbl>    <dbl>     <dbl> <chr>         <chr>
1     0 vo2       3.5      0.561   0.614         3    -32.9      39.9 Paired t-test two.sided
2     0 vt       10.2      2.23    0.112         3    -16.6      37.1 Paired t-test two.sided
3     1 vo2       9.33     1.39    0.298         2    -57.1      75.7 Paired t-test two.sided
4     1 vt        3.67     0.878   0.473         2    -37.8      45.1 Paired t-test two.sided
5     2 vo2       4        0.276   0.808         2   -140.      148.  Paired t-test two.sided
6     2 vt        9.67     1.76    0.220         2    -44.8      64.1 Paired t-test two.sided
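One caveat: recent R releases (4.4.0 and later, as far as I know) raise an error when `paired` is passed to the formula interface of `t.test()`, and `cur_data()` is deprecated in newer dplyr versions. A sketch that sidesteps both is to keep `pre` and `post` as two separate columns, using the special `".value"` name in `pivot_longer()`, and pass them to `t.test()` directly. The `DLR_df` below is only the sample data from the question retyped so the snippet runs on its own, and `paired_results` is a name I made up for illustration.

```r
library(dplyr)
library(tidyr)
library(broom)

# Sample data retyped from the question (assumption: your real data has the same layout)
DLR_df <- data.frame(
  Group    = c(0, 1, 2, 1, 2, 0, 0, 1, 2, 0),
  vo2_pre  = c(20, 30, 10, 22, 34, 30, 39, 40, 50, 34),
  vo2_post = c(20, 19, 30, 25, 32, 40, 19, 20, 20, 30),
  vt_pre   = c(21, 10, 53, 32, 40, 50, 40, 20, 10, 23),
  vt_post  = c(23, 11, 34, 20, 30, 40, 20, 20, 10, 10)
)

paired_results <- DLR_df %>%
  # "vo2_pre" -> test = "vo2", value goes into a column named "pre";
  # ".value" keeps pre and post side by side, one row per subject and test
  pivot_longer(-Group, names_to = c("test", ".value"), names_sep = "_") %>%
  group_by(Group, test) %>%
  # default (non-formula) interface: paired test of pre against post
  summarise(tidy(t.test(pre, post, paired = TRUE, conf.level = 0.99)),
            .groups = "drop")

paired_results
```

The estimate is again pre minus post. This relies on every measurement column following the `name_pre` / `name_post` pattern with a single underscore. If your variable names contain extra underscores, `names_sep = "_"` will not split them cleanly and you would need `names_pattern` instead.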