I would like to know if the rows selected by:
groupby(['a', 'b'])
are the same as the rows selected by:
groupby(['b', 'a'])
In this case the order of the rows doesn't matter.
Is there any case in which groupby
does not fulfill the commutative property?
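I think the order of the keys does not matter for which rows are grouped together, so counts and other aggregates come out the same. The only difference is in the output: the index levels (or the first columns) appear in the same order as the column names in the list you pass to groupby.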
import pandas as pd

df = pd.DataFrame({
    'a': list('aaaaaa'),
    'b': [4, 5, 4, 5, 5, 4],
    'c': [7, 8, 9, 4, 2, 3],
})
Order of the index levels after a groupby aggregation:
df1 = df.groupby(['a', 'b']).sum()
print(df1)
      c
a b
a 4  19
  5  14
df2 = df.groupby(['b', 'a']).sum()
print(df2)
      c
b a
4 a  19
5 a  14
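One thing that can differ besides the level order is the order of the rows in the aggregated result, because groupby sorts by the keys by default, starting with the first one in the list. The sketch below uses a small hypothetical frame (`demo`, not from the question) with two distinct values in each key column to show that the aggregated values match once the levels are swapped and the index is sorted:

```python
import pandas as pd

# Hypothetical data: two distinct values in each key column.
demo = pd.DataFrame({
    'a': ['x', 'y', 'x', 'y'],
    'b': [1, 1, 2, 2],
    'c': [10, 20, 30, 40],
})

ab = demo.groupby(['a', 'b']).sum()
ba = demo.groupby(['b', 'a']).sum()

# Reorder the levels of the second result to (a, b) so the two are comparable.
ba_swapped = ba.swaplevel()

# The rows come out in a different order, since each result is sorted by its own first key.
same_row_order = list(ab.index) == list(ba_swapped.index)

# After sorting the swapped index, both results hold the same values.
same_after_sort = ab.equals(ba_swapped.sort_index())

print(same_row_order)
print(same_after_sort)
```

So the groups and their sums agree, and only the presentation (level order and row order) depends on the key order.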
And the order of the columns, if you use as_index=False:
df3 = df.groupby(['a', 'b'], as_index=False).sum()
print(df3)
   a  b   c
0  a  4  19
1  a  5  14
df4 = df.groupby(['b', 'a'], as_index=False).sum()
print(df4)
   b  a   c
0  4  a  19
1  5  a  14
If you use a transformation to create a new column with the same size as the original DataFrame, the result is the same for both key orders:
df['new1'] = df.groupby(['a', 'b'])['c'].transform('sum')
df['new2'] = df.groupby(['b', 'a'])['c'].transform('sum')
print(df)
   a  b  c  new1  new2
0  a  4  7    19    19
1  a  5  8    14    14
2  a  4  9    19    19
3  a  5  4    14    14
4  a  5  2    14    14
5  a  4  3    19    19
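To check directly that both key orders select the same rows, one option is to compare the row labels stored in the `.groups` attribute of each groupby object. This is only a sketch on the sample data above. The names `groups_ab`, `groups_ba` and `same_groups` are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'a': list('aaaaaa'),
    'b': [4, 5, 4, 5, 5, 4],
    'c': [7, 8, 9, 4, 2, 3],
})

# .groups maps each group key (a tuple) to the row labels in that group.
groups_ab = {key: set(rows)
             for key, rows in df.groupby(['a', 'b']).groups.items()}

# For the reversed key order, flip each key tuple back to (a, b) before comparing.
groups_ba = {(a, b): set(rows)
             for (b, a), rows in df.groupby(['b', 'a']).groups.items()}

# True when every (a, b) combination contains exactly the same rows.
same_groups = groups_ab == groups_ba
print(same_groups)
```

Sets are used for the row labels because, as stated in the question, the order of the rows within a group does not matter here.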