
How to loop Bartlett test and Kruskal tests for multiple columns in a dataframe?


I have the following data frame, which I'm calling "test", and I am trying to run Bartlett's test and a Kruskal-Wallis test for each "metab" column against "diagnosis":

> test

Index   tube.label  age gender  diagnosis   metab1  metab2  metab3  metab4  metab5  metab6
1            200    73  Male    Cancer         6    1.5         2      5       8    1.5
2            201    71  Male    Healthy        6    1.5         2    11.5     50    1.5
4            202    76  Male    Adenoma        2    1.5         2      5       8    1.5
7            203    58  Female  Cancer         2    1.5         2    1.5     2.5    1.5
9            204    73  Male    Cancer         2    1.5         2    1.5       8    1.5
12           205    72  Male    Healthy        6    1.5    17.8272  13.5    184.2   4.5
13           206    46  Female  Cancer     30.0530  1.5        2    21.2    16.6    4.5
14           207    38  Female  Healthy        6    1.5        2    12.494  31.59   1.5
15           208    60  Male    Cancer         6    1.5        2    13.2    53.2    4.5
16           209    72  Female  Cancer         6    1.5        2    1.5        8    1.5
17           210    72  Male    Adenoma        6    1.5        2    22.829  102.44  9.069
18           211    52  Male    Cancer         6    1.5        2    1.5        8    1.5
19           212    64  Male    Healthy        6    1.5        2    1.5        8    1.5
20           213    68  Male    Cancer         6    1.5        2    26.685  40.9    4.5
21           214    60  Male    Healthy    24.902   1.5   42.443    22.942  498.5   4.5
23           215    70  Female  Healthy         6   1.5        2    1.5     19.908  4.5
24           216    42  Female  Healthy         6   1.5        2    1.5      17.7   1.5
25           217    72  Male    Inflammation    6   1.5        2    1.5         8   1.5
26           218    71  Male    Healthy        51   1.5        2    41.062  182.2   11.340
27           219    51  Female  Inflammation    2   1.5        2    1.5         8   1.5

I can run the tests individually, and each gives me the proper values:

bartlett.test(metab1 ~ diagnosis, data = test)

    Bartlett test of homogeneity of variances

data:  metab1 by diagnosis
Bartlett's K-squared = 5.1526, df = 3, p-value = 0.161

kruskal.test(metab1 ~ diagnosis, data = test)

    Kruskal-Wallis rank sum test

data:  metab1 by diagnosis
Kruskal-Wallis chi-squared = 4.3475, df = 3, p-value = 0.2263

However, when I try to run them in a for loop (I have more than 100 of them to run), I keep getting the following error:

Bartlett error:

testcols <- colnames(test[6:ncol(test)])
for (met in testcols){
  bartlett.test(met ~ diagnosis, data = test)
}

>Error in model.frame.default(formula = met ~ diagnosis, data = test) : 
  variable lengths differ (found for 'diagnosis')

Kruskal-Wallis error:

for(met in testcols){
  kruskal.test(met ~ diagnosis,data = test)
}

>Error in model.frame.default(formula = met ~ diagnosis, data = test) : 
  variable lengths differ (found for 'diagnosis')

Should I be using something else? Thank you for the help!


Solution

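  • The loop fails because met inside the formula is not replaced by the column name: R looks for a variable literally called met, finds the length-1 character string from the loop, and its length does not match diagnosis. Build the formula from the column name with reformulate instead: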

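    # Names of the metabolite columns (column 6 onwards)
    cols <- names(test)[6:ncol(test)]
    
    # Build "metabN ~ diagnosis" for each column and run the test
    all_test <- lapply(cols, function(x) 
                        bartlett.test(reformulate("diagnosis", response = x), data = test))
    names(all_test) <- cols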
    

    You can do the same with kruskal.test by swapping it in for bartlett.test in the call above.
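
As a rough sketch of the kruskal.test version, the block below applies the same reformulate approach and then pulls the p-values out of the results. The data frame demo, its random values, and the names kw_tests and kw_p are made up here so the example runs on its own; with the real data you would use test and its metabolite columns instead.

```r
# Small made-up data frame standing in for `test` (values are illustrative only)
set.seed(1)
demo <- data.frame(
  diagnosis = rep(c("Cancer", "Healthy", "Adenoma"), each = 5),
  metab1 = rnorm(15, mean = 10, sd = 2),
  metab2 = rnorm(15, mean = 5, sd = 1),
  metab3 = rnorm(15, mean = 20, sd = 4)
)

# Every column after `diagnosis` is treated as a metabolite
cols <- names(demo)[2:ncol(demo)]

# Build "metabN ~ diagnosis" for each column and run the Kruskal-Wallis test
kw_tests <- lapply(cols, function(x)
  kruskal.test(reformulate("diagnosis", response = x), data = demo))
names(kw_tests) <- cols

# Collect the p-values into a named numeric vector, one entry per column
kw_p <- sapply(kw_tests, function(res) res$p.value)
kw_p
```

Individual results stay available by name, for example kw_tests$metab1, which is convenient when there are more than 100 columns and only some of them need a closer look.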