I have a dataframe of 27 columns (26 are numeric variables and the 27th column tells me which group each row is associated with). There are 7 groups in total. I'm trying to apply the Kruskal-Wallis test to each variable, split by group, to determine whether there is a significant difference between groups.
I have tried:
df.groupby(['treatment']).apply(kruskal)
which throws the error "Need at least two groups in stats.kruskal()".
My other attempts haven't produced an output either. I'll be doing similar analyses on a regular basis and with larger datasets. Can someone help me understand this issue and how to fix it?
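The error occurs because groupby().apply() calls kruskal once per group, passing that group's whole sub-dataframe as a single argument, so kruskal never receives more than one sample at a time. With SciPy, you can instead pass one array per group, for each variable: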
scipy.stats.kruskal(*[group["variable"].values for name, group in df.groupby("treatment")])
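To run this over all 26 variables at once, the same call can be wrapped in a loop. The following is only a sketch: the helper name kruskal_by_group and the demo data are made up for illustration, and it assumes every column other than the grouping column is numeric.

```python
import numpy as np
import pandas as pd
from scipy.stats import kruskal


def kruskal_by_group(df, group_col):
    """Hypothetical helper: Kruskal-Wallis test for every column except group_col.

    Assumes all remaining columns are numeric.
    """
    rows = []
    for col in df.columns.drop(group_col):
        # Build one array per group, dropping missing values.
        samples = [g[col].dropna().to_numpy() for _, g in df.groupby(group_col)]
        stat, p = kruskal(*samples)
        rows.append({"variable": col, "H": stat, "p_value": p})
    return pd.DataFrame(rows).set_index("variable")


# Made-up demo data: 3 groups of 20 rows, 2 variables.
# "a" has no group effect; "b" is shifted strongly between groups.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "a": rng.normal(size=60),
    "b": np.concatenate([
        rng.normal(0, 1, 20),
        rng.normal(5, 1, 20),
        rng.normal(10, 1, 20),
    ]),
    "treatment": np.repeat(["g1", "g2", "g3"], 20),
})

results = kruskal_by_group(demo, "treatment")
print(results)
```

For your data this would be called as kruskal_by_group(df, "treatment"), giving one row per variable with its H statistic and p-value. Because the function only depends on the column names, it should carry over to larger datasets unchanged.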