
Use Pandas groupby() + apply() with arguments


I would like to use df.groupby() in combination with apply() to apply a function to each row per group.

I normally use the following code, which usually works (note that this is without groupby()):

df.apply(myFunction, args=(arg1,))

With the groupby() I tried the following:

df.groupby('columnName').apply(myFunction, args=(arg1,))

However, I get the following error:

TypeError: myFunction() got an unexpected keyword argument 'args'

Hence, my question is: How can I use groupby() and apply() with a function that needs arguments?


Solution

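  • pandas.core.groupby.GroupBy.apply does NOT have a named parameter args, whereas pandas.DataFrame.apply does. GroupBy.apply forwards any extra positional and keyword arguments straight to the function, so args=(arg1,) reaches myFunction as a keyword argument named args, which causes the TypeError.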

    So try this:

    df.groupby('columnName').apply(lambda x: myFunction(x, arg1))
    

    or, as suggested by @Zero, pass the argument positionally after the function:

    df.groupby('columnName').apply(myFunction, arg1)
    

    Demo:

    In [82]: df = pd.DataFrame(np.random.randint(5,size=(5,3)), columns=list('abc'))
    
    In [83]: df
    Out[83]:
       a  b  c
    0  0  3  1
    1  0  3  4
    2  3  0  4
    3  4  2  3
    4  3  4  1
    
    In [84]: def f(ser, n):
        ...:     return ser.max() * n
        ...:
    
    In [85]: df.apply(f, args=(10,))
    Out[85]:
    a    40
    b    40
    c    40
    dtype: int64
    

    When using GroupBy.apply you can pass either named arguments:

    In [86]: df.groupby('a').apply(f, n=10)
    Out[86]:
        a   b   c
    a
    0   0  30  40
    3  30  40  40
    4  40  20  30
    

    or positional arguments (note that (10) is just the integer 10, not a tuple):

    In [87]: df.groupby('a').apply(f, 10)
    Out[87]:
        a   b   c
    a
    0   0  30  40
    3  30  40  40
    4  40  20  30
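
The variants above can be put side by side in one script. This is a minimal sketch, not a definitive recipe: it reuses the demo values of df and the function f from the transcript, and it selects the columns b and c explicitly (an assumption added here) so that the result does not depend on how a given pandas version treats the grouping column inside apply.

```python
import pandas as pd

# Same values as the demo frame above, written out so the result is reproducible.
df = pd.DataFrame({"a": [0, 0, 3, 4, 3],
                   "b": [3, 3, 0, 2, 4],
                   "c": [1, 4, 4, 3, 1]})


def f(group, n):
    # Column-wise maximum of the group, scaled by n.
    return group.max() * n


# Assumption for this sketch: pick the value columns explicitly.
grouped = df.groupby("a")[["b", "c"]]

by_position = grouped.apply(f, 10)                # extra positional argument
by_keyword = grouped.apply(f, n=10)               # extra keyword argument
by_lambda = grouped.apply(lambda g: f(g, 10))     # closure, no forwarding needed

# The DataFrame.apply spelling fails here: args=(10,) is forwarded to f
# as a keyword argument called "args", which f does not accept.
try:
    grouped.apply(f, args=(10,))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(by_position)
print(error_message)
```

All three working calls should give the same frame, indexed by a with the columns b and c. The lambda form is the most flexible of the three, since it also works when the extra argument is not the second parameter of the function.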