I have a data frame (df) in R with 50 unique combinations of A and B.
For each combination of A and B, I want to perform a Kruskal-Wallis test: kruskal.test(D ~ C, data = df)
I want to find out for which combinations of A and B the null hypothesis is rejected.
How can I do this without writing a separate test for each combination? A sample of my data is below:
A B C D
mix1 size1 1 0.2
mix1 size1 2 0.15
mix1 size1 3 0.22
mix1 size1 4 0.215
mix2 size1 1 0.2
mix2 size1 2 0.15
mix2 size1 3 0.2
mix2 size1 4 0.15
mix2 size2 1 0.21
mix2 size2 2 0.11
mix2 size2 3 0.23
mix2 size2 4 0.615
...
mix22 size1 1 0.01
mix22 size1 2 0.18
mix22 size1 3 0.7
mix22 size1 4 0.17
My expected output is a data frame/table with the p-value from the Kruskal-Wallis test for each combination of A and B:
A B P
mix1 size1 0.005
mix2 size1 0.211
Perhaps with something from the *apply family?
Here's a tidyverse approach using dplyr and rstatix. I used the data posted by @jay.sf.
library(dplyr)
library(rstatix)

df |>
  group_by(A, B) |>
  kruskal_test(D ~ C)
# A tibble: 4 × 8
#   A     B     .y.       n statistic    df     p method
#*  <chr> <chr> <chr> <int>     <dbl> <int> <dbl> <chr>
#1  mix1  size1 D         4         3     3 0.392 Kruskal-Wallis
#2  mix2  size1 D         4         3     3 0.392 Kruskal-Wallis
#3  mix2  size2 D         4         3     3 0.392 Kruskal-Wallis
#4  mix22 size1 D         4         3     3 0.392 Kruskal-Wallis
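Since the question asks about the *apply family, here is a minimal base R sketch of the same idea, with no extra packages. It assumes the columns are named A, B, C and D as in the question. The data frame below is rebuilt from the sample rows only (not the full 50 combinations), and `groups`, `rows` and `result` are names chosen here for illustration.

```r
# Sample rows from the question (only the combinations shown there)
df <- data.frame(
  A = rep(c("mix1", "mix2", "mix2", "mix22"), each = 4),
  B = rep(c("size1", "size1", "size2", "size1"), each = 4),
  C = rep(1:4, times = 4),
  D = c(0.2,  0.15, 0.22, 0.215,
        0.2,  0.15, 0.2,  0.15,
        0.21, 0.11, 0.23, 0.615,
        0.01, 0.18, 0.7,  0.17)
)

# One data frame per observed (A, B) combination;
# drop = TRUE skips combinations that never occur in the data
groups <- split(df, list(df$A, df$B), drop = TRUE)

# Run the test on each piece and keep the keys plus the p-value
rows <- lapply(groups, function(g) {
  data.frame(A = g$A[1], B = g$B[1],
             P = kruskal.test(D ~ C, data = g)$p.value)
})

# Stack the one-row results into a single table with columns A, B, P
result <- do.call(rbind, rows)
rownames(result) <- NULL
result
```

On these sample rows every combination has exactly one observation per level of C, so each test gives the same p-value (about 0.392), which agrees with the rstatix output above. With the full data the values should differ by combination.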