r kruskal-wallis

Perform a Kruskal-Wallis test for a large number of combinations in a data frame


I have a data frame df in R with 50 unique combinations of A and B. For each combination of A and B, I want to perform a Kruskal-Wallis test: kruskal.test(D ~ C, data = df)

I want to find out for which combinations of A and B the null hypothesis is rejected.

How can I do this without writing a separate test for each combination? A sample of my data is below:

A     B       C     D
mix1 size1    1     0.2
mix1 size1    2     0.15
mix1 size1    3     0.22
mix1 size1    4     0.215
mix2 size1    1     0.2
mix2 size1    2     0.15
mix2 size1    3     0.2
mix2 size1    4     0.15
mix2 size2    1     0.21
mix2 size2    2     0.11
mix2 size2    3     0.23
mix2 size2    4     0.615
...
mix22 size1    1     0.01
mix22 size1    2     0.18
mix22 size1    3     0.7
mix22 size1    4     0.17

My expected output is a data frame/table with the p-value from the Kruskal-Wallis test for each combination of A and B.

A     B    P
mix1 size1 0.005
mix2 size1 0.211

Perhaps with something from the *apply family?


Solution

  • Here's a tidyverse approach using dplyr and rstatix. I used the data posted by @jay.sf.

    library(dplyr)
    library(rstatix)
    
    # Run one Kruskal-Wallis test of D by C within each A/B combination
    df |>
      group_by(A, B) |>
      kruskal_test(D ~ C)
    
    # A tibble: 4 × 8
    #  A     B     .y.       n statistic    df     p method        
    #* <chr> <chr> <chr> <int>     <dbl> <int> <dbl> <chr>         
    #1 mix1  size1 D         4       3       3 0.392 Kruskal-Wallis
    #2 mix2  size1 D         4       3       3 0.392 Kruskal-Wallis
    #3 mix2  size2 D         4       3       3 0.392 Kruskal-Wallis
    #4 mix22 size1 D         4       3       3 0.392 Kruskal-Wallis
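
    Since the question asks about the *apply family, here is a rough base R sketch of the same idea, with no extra packages. It assumes the columns are named A, B, C and D as in the question, and it rebuilds df from the sample rows shown there (the elided rows are left out). The names groups and res are only illustrative.

```r
# Sample rows copied from the question (elided rows are not included)
df <- data.frame(
  A = rep(c("mix1", "mix2", "mix2", "mix22"), each = 4),
  B = rep(c("size1", "size1", "size2", "size1"), each = 4),
  C = rep(1:4, times = 4),
  D = c(0.2, 0.15, 0.22, 0.215,
        0.2, 0.15, 0.2, 0.15,
        0.21, 0.11, 0.23, 0.615,
        0.01, 0.18, 0.7, 0.17)
)

# One data frame per A/B combination; drop = TRUE skips
# combinations that never occur in the data (e.g. mix1/size2)
groups <- split(df, list(df$A, df$B), drop = TRUE)

# Run the test on each subset and keep only the p-value
res <- do.call(rbind, lapply(groups, function(g) {
  data.frame(A = g$A[1], B = g$B[1],
             P = kruskal.test(D ~ C, data = g)$p.value)
}))
rownames(res) <- NULL
res
```

    The result has one row per combination, with columns A, B and P, which is the shape asked for in the question. Filtering with something like res[res$P < 0.05, ] would then list the combinations where the null hypothesis is rejected at the 5% level (the threshold is an assumption, not something stated in the question).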