I have a data frame named KWR with 90 observations and 124306 variables, all numeric. I want to run a Kruskal-Wallis test on every column, comparing groups. I added a column named "Group" after my variables that holds each observation's group. As a check, I tested one peptide (named X2461) with this code:
kruskal.test(X2461 ~ Group, data = KWR)
This worked fine and returned a result instantly. However, I need to analyze all the variables. Based on the post "How to loop Bartlett test and Kruskal tests for multiple columns in a dataframe?", I used lapply:
cols <- names(KWR)[1:124306]
allKWR <- lapply(cols, function(x) kruskal.test(reformulate("Group", x), data = KWR))
However, after 2 hours of R running non-stop, I killed the job. Is there a more efficient way of doing this?
Thanks in advance.
NB: first-time poster, beginner in R
Take a look at kruskaltests in the Rfast package. For the KWR data.frame, it appears it would be something like:
allKWR <- Rfast::kruskaltests(as.matrix(KWR[,1:124306]), as.numeric(as.factor(KWR$Group)))
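If installing Rfast is not an option, or you want an independent check of its output on a few columns, the same idea (rank every column once, then compute the statistic with matrix operations instead of calling kruskal.test per column) can be sketched in base R. The helper name kw_by_column is hypothetical and not part of any package, and the sketch assumes there are no missing values in the data:

```r
# Hypothetical helper: column-wise Kruskal-Wallis test in base R.
# Assumes `x` is numeric with no NAs and `group` has one entry per row of `x`.
kw_by_column <- function(x, group) {
  x <- as.matrix(x)
  g <- droplevels(as.factor(group))
  n <- nrow(x)
  n_j <- as.vector(table(g))          # group sizes, in factor-level order

  r <- apply(x, 2, rank)              # midranks within each column
  rank_sums <- rowsum(r, g)           # groups x columns matrix of rank sums

  # H statistic before the tie correction
  h <- 12 / (n * (n + 1)) * colSums(rank_sums^2 / n_j) - 3 * (n + 1)

  # Tie correction: sum of (t^3 - t) over the tied-value counts in each column
  ties <- apply(x, 2, function(col) {
    t <- tabulate(match(col, unique(col)))
    sum(t^3 - t)
  })
  h <- h / (1 - ties / (n^3 - n))

  data.frame(statistic = h,
             p.value = pchisq(h, df = nlevels(g) - 1, lower.tail = FALSE))
}

# Small simulated example (made-up data, not the real KWR)
set.seed(1)
demo <- as.data.frame(matrix(rnorm(90 * 5), nrow = 90))
demo$Group <- rep(c("a", "b", "c"), each = 30)
res <- kw_by_column(demo[, 1:5], demo$Group)

# Compare the first column against the built-in test
ref <- kruskal.test(V1 ~ Group, data = demo)
all.equal(unname(res$statistic[1]), unname(ref$statistic))
```

For the real data, the call would presumably be kw_by_column(KWR[, 1:124306], KWR$Group). Whichever route you take, comparing a handful of columns against kruskal.test, as you did with X2461, is a cheap way to confirm the results agree.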