I am grouping my data as below:
all_groups = df.groupby('age').groups
Printing all_groups shows:
{1.0: [11, 14, 15, 22], 2.0: [12, 13, 27], 3.0: [16, 17, 19, 20, 23, 24],
6.0: [21], 7.0: [18, 25, 26], 11.0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
Now I want to run stats.mannwhitneyu on every possible pair of classes. In this example I have 6 groups, so 15 combinations are possible, e.g., stats.mannwhitneyu(class1, class2), stats.mannwhitneyu(class1, class3), ..., stats.mannwhitneyu(class7, class11).
I need a general approach, especially since I don't know the number of classes in advance. What is the cleanest way to do it? Thank you in advance.
You could compute a GroupBy object, then apply your test on all combinations:
from itertools import combinations

import pandas as pd
from scipy.stats import mannwhitneyu

# Iterating over the GroupBy object yields (key, values) pairs
groups = df.groupby('age')['value']
out = pd.DataFrame.from_dict({(a[0], b[0]): mannwhitneyu(a[1], b[1])
                              for a, b in combinations(groups, 2)},
                             orient='index')
Example:
          statistic    pvalue
(0, 1)         17.0  0.939860
(0, 2)         14.0  1.000000
(0, 3)         61.0  0.205667
(0, 4)         28.0  0.757692
(0, 5)         20.0  0.797203
...             ...       ...
(16, 18)        8.0  1.000000
(16, 19)       13.0  0.380952
(17, 18)       17.0  0.420635
(17, 19)       21.0  0.329004
(18, 19)       18.0  0.662338

[190 rows x 2 columns]
Used input:
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({'age': np.random.randint(0, 20, 100),
                   'value': np.random.random(100)
                   })
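If you would rather have the two group keys in separate columns (easier to filter or merge later), the same idea can be wrapped in a small function. This is only a sketch: the name pairwise_mannwhitneyu, the column names group_a and group_b, and the tiny demo frame are my own choices for illustration, not part of pandas or SciPy.

```python
from itertools import combinations

import pandas as pd
from scipy.stats import mannwhitneyu


def pairwise_mannwhitneyu(df, group_col, value_col):
    """Run mannwhitneyu on every pair of groups (hypothetical helper)."""
    groups = df.groupby(group_col)[value_col]
    rows = []
    # combinations() pairs up the (key, values) tuples yielded by the GroupBy
    for (key_a, vals_a), (key_b, vals_b) in combinations(groups, 2):
        res = mannwhitneyu(vals_a, vals_b)
        rows.append({'group_a': key_a, 'group_b': key_b,
                     'statistic': res.statistic, 'pvalue': res.pvalue})
    return pd.DataFrame(rows)


# Made-up demo data: three groups of three values each
demo = pd.DataFrame({'age': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                     'value': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]})
result = pairwise_mannwhitneyu(demo, 'age', 'value')
print(result)
```

The number of groups never has to be known up front, since combinations() simply consumes whatever the GroupBy yields.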
If you want a square matrix of p-values as output, use squareform:
from scipy.spatial.distance import squareform

idx = sorted(df['age'].unique())
out = pd.DataFrame(squareform([mannwhitneyu(a[1], b[1]).pvalue
                               for a, b in combinations(groups, 2)]),
                   index=idx, columns=idx).sort_index().sort_index(axis=1)
Output:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
0 0.000000 0.939860 1.000000 0.205667 0.757692 0.797203 0.297702 0.330070 0.863636 0.035964 0.260140 0.727273 0.148252 1.000000 0.898102 0.114161 1.000000 0.898102 0.699301 0.528671
1 0.939860 0.000000 0.857143 0.187812 0.787879 0.730159 0.285714 0.485714 0.857143 0.066667 0.485714 1.000000 0.200000 1.000000 0.904762 0.163636 1.000000 0.555556 1.000000 0.609524
2 1.000000 0.857143 0.000000 0.286713 0.666667 0.571429 0.250000 1.000000 1.000000 0.095238 0.400000 1.000000 0.228571 0.800000 1.000000 0.266667 1.000000 0.392857 1.000000 0.714286
3 0.205667 0.187812 0.286713 0.000000 0.193233 0.055278 0.953047 0.733267 0.468531 0.313187 0.839161 0.216783 0.023976 0.363636 0.439560 0.417318 0.216783 0.055278 0.206460 0.792458
4 0.757692 0.787879 0.666667 0.193233 0.000000 0.876263 0.267677 0.315152 1.000000 0.073427 0.230303 0.833333 0.315152 0.888889 0.530303 0.164918 1.000000 1.000000 0.431818 0.533800
5 0.797203 0.730159 0.571429 0.055278 0.876263 0.000000 0.150794 0.190476 1.000000 0.017316 0.063492 0.785714 0.555556 0.857143 0.690476 0.106061 1.000000 1.000000 0.309524 0.246753
6 0.297702 0.285714 0.250000 0.953047 0.267677 0.150794 0.000000 0.555556 0.571429 0.428571 1.000000 0.250000 0.063492 0.380952 0.309524 0.755051 0.392857 0.095238 0.222222 0.930736
7 0.330070 0.485714 1.000000 0.733267 0.315152 0.190476 0.555556 0.000000 0.400000 0.114286 0.685714 0.857143 0.028571 0.800000 0.412698 0.527273 0.400000 0.111111 0.555556 0.914286
8 0.863636 0.857143 1.000000 0.468531 1.000000 1.000000 0.571429 0.400000 0.000000 0.166667 0.628571 1.000000 0.400000 1.000000 0.571429 0.266667 1.000000 1.000000 1.000000 0.904762
9 0.035964 0.066667 0.095238 0.313187 0.073427 0.017316 0.428571 0.114286 0.166667 0.000000 0.609524 0.047619 0.009524 0.285714 0.051948 0.730769 0.166667 0.004329 0.051948 0.240260
10 0.260140 0.485714 0.400000 0.839161 0.230303 0.063492 1.000000 0.685714 0.628571 0.609524 0.000000 0.228571 0.057143 0.533333 0.412698 0.927273 0.400000 0.111111 0.285714 0.761905
11 0.727273 1.000000 1.000000 0.216783 0.833333 0.785714 0.250000 0.857143 1.000000 0.047619 0.228571 0.000000 0.228571 1.000000 0.785714 0.266667 1.000000 0.571429 1.000000 0.714286
12 0.148252 0.200000 0.228571 0.023976 0.315152 0.555556 0.063492 0.028571 0.400000 0.009524 0.057143 0.228571 0.000000 0.533333 0.063492 0.024242 0.628571 0.285714 0.063492 0.171429
13 1.000000 1.000000 0.800000 0.363636 0.888889 0.857143 0.380952 0.800000 1.000000 0.285714 0.533333 1.000000 0.533333 0.000000 1.000000 0.333333 0.800000 0.857143 1.000000 0.642857
14 0.898102 0.904762 1.000000 0.439560 0.530303 0.690476 0.309524 0.412698 0.571429 0.051948 0.412698 0.785714 0.063492 1.000000 0.000000 0.343434 0.785714 0.841270 0.841270 0.792208
15 0.114161 0.163636 0.266667 0.417318 0.164918 0.106061 0.755051 0.527273 0.266667 0.730769 0.927273 0.266667 0.024242 0.333333 0.343434 0.000000 0.266667 0.073232 0.202020 0.365967
16 1.000000 1.000000 1.000000 0.216783 1.000000 1.000000 0.392857 0.400000 1.000000 0.166667 0.400000 1.000000 0.628571 0.800000 0.785714 0.266667 0.000000 1.000000 1.000000 0.380952
17 0.898102 0.555556 0.392857 0.055278 1.000000 1.000000 0.095238 0.111111 1.000000 0.004329 0.111111 0.571429 0.285714 0.857143 0.841270 0.073232 1.000000 0.000000 0.420635 0.329004
18 0.699301 1.000000 1.000000 0.206460 0.431818 0.309524 0.222222 0.555556 1.000000 0.051948 0.285714 1.000000 0.063492 1.000000 0.841270 0.202020 1.000000 0.420635 0.000000 0.662338
19 0.528671 0.609524 0.714286 0.792458 0.533800 0.246753 0.930736 0.914286 0.904762 0.240260 0.761905 0.714286 0.171429 0.642857 0.792208 0.365967 0.380952 0.329004 0.662338 0.000000
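Note that the zeros on the diagonal are just how squareform fills the matrix, not p-values from an actual test of a group against itself. If that could be misread, one option is to overwrite the diagonal with NaN. A minimal self-contained sketch, reusing the same random input as above (the names keys, pvals and matrix are mine):

```python
from itertools import combinations

import numpy as np
import pandas as pd
from scipy.spatial.distance import squareform
from scipy.stats import mannwhitneyu

np.random.seed(0)
df = pd.DataFrame({'age': np.random.randint(0, 20, 100),
                   'value': np.random.random(100)})

groups = df.groupby('age')['value']
keys = [key for key, _ in groups]  # group keys, in groupby (sorted) order
pvals = [mannwhitneyu(a, b).pvalue
         for (_, a), (_, b) in combinations(groups, 2)]

matrix = squareform(pvals)        # symmetric matrix with a zero diagonal
np.fill_diagonal(matrix, np.nan)  # mark "no test performed" explicitly
out = pd.DataFrame(matrix, index=keys, columns=keys)
```

Taking the labels from the GroupBy itself keeps them in the same order as the pairs passed to squareform, so no extra sorting is needed.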