
Is groupby from pandas commutative?


I would like to know if the rows selected by:

groupby(['a', 'b']) 

are the same as the rows selected by:

groupby(['b', 'a'])

In this case the order of the rows doesn't matter.

Is there any case in which groupby does not fulfill the commutative property?


Solution

  • I think the order of the keys does not matter for which rows end up in each group. It only affects the result: after groupby, the index levels (or the key columns) appear in the same order as the keys in the list.

    import pandas as pd

    df = pd.DataFrame({
        'a': list('aaaaaa'),
        'b': [4, 5, 4, 5, 5, 4],
        'c': [7, 8, 9, 4, 2, 3],
    })
    

    Order of levels after groupby aggregation:

    df1 = df.groupby(['a', 'b']).sum()
    print(df1)
          c
    a b    
    a 4  19
      5  14
    
    df2 = df.groupby(['b', 'a']).sum()
    print(df2)
          c
    b a    
    4 a  19
    5 a  14
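
In the sample above column a holds a single value, so the rows of both results happen to come out in the same order. As a minimal sketch with hypothetical data (the frame `demo` and its values are made up for illustration), the following shows that when both keys vary, the aggregated values are the same but the row order and level order differ, and that aligning the levels and sorting makes the two results equal:

```python
import pandas as pd

# Hypothetical data in which both keys take several values,
# so the sorted order of the groups depends on the key order.
demo = pd.DataFrame({
    'a': ['x', 'y', 'x', 'y', 'x', 'y'],
    'b': [2, 1, 1, 2, 2, 1],
    'c': [7, 8, 9, 4, 2, 3],
})

ab = demo.groupby(['a', 'b']).sum()
ba = demo.groupby(['b', 'a']).sum()

# Same sums, but listed in a different row order.
print(ab['c'].tolist())
print(ba['c'].tolist())

# Bring `ba` into the (a, b) level order and sort its rows the same way.
ba_aligned = ba.reorder_levels(['a', 'b']).sort_index()
print(ab.equals(ba_aligned))
```

So any code that relies on the position of a row in the aggregated result, rather than on its key, can behave differently after swapping the keys.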
    

    And the order of the key columns when as_index=False is used:

    df3 = df.groupby(['a', 'b'], as_index=False).sum()
    print(df3)
       a  b   c
    0  a  4  19
    1  a  5  14
    
    df4 = df.groupby(['b', 'a'], as_index=False).sum()
    print(df4)
       b  a   c
    0  4  a  19
    1  5  a  14
    

    If you use transform to create a new column with the same length as the original DataFrame, the result is the same for both key orders:

    df['new1'] = df.groupby(['a', 'b'])['c'].transform('sum')
    df['new2'] = df.groupby(['b', 'a'])['c'].transform('sum')
    print(df)
       a  b  c  new1  new2
    0  a  4  7    19    19
    1  a  5  8    14    14
    2  a  4  9    19    19
    3  a  5  4    14    14
    4  a  5  2    14    14
    5  a  4  3    19    19
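
To check the original question directly (are the same rows selected?), one possible approach is to compare the row labels of each group for both key orders. This is only a sketch: the helper `row_sets` and the frame `demo` are hypothetical names introduced here, and the check ignores the order of the groups and of the rows within them.

```python
import pandas as pd

# Hypothetical data in which both keys take several values.
demo = pd.DataFrame({
    'a': ['x', 'y', 'x', 'y', 'x', 'y'],
    'b': [2, 1, 1, 2, 2, 1],
    'c': [7, 8, 9, 4, 2, 3],
})

def row_sets(frame, keys):
    """Return the groups as a set of frozensets of row labels."""
    # .groups maps each group key to the Index of row labels in that group.
    groups = frame.groupby(keys).groups
    return {frozenset(labels) for labels in groups.values()}

# The partition of the rows is the same for both key orders.
same_rows = row_sets(demo, ['a', 'b']) == row_sets(demo, ['b', 'a'])
print(same_rows)
```

Only the group keys are reversed, for example ('x', 2) becomes (2, 'x'); the rows belonging to each group stay the same.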