Tags: python, pandas, dataframe, group-by, aggregate

How do I use Pandas group-by to get the sum?


I am using this dataframe:

Fruit   Date      Name  Number
Apples  10/6/2016 Bob    7
Apples  10/6/2016 Bob    8
Apples  10/6/2016 Mike   9
Apples  10/7/2016 Steve 10
Apples  10/7/2016 Bob    1
Oranges 10/7/2016 Bob    2
Oranges 10/6/2016 Tom   15
Oranges 10/6/2016 Mike  57
Oranges 10/6/2016 Bob   65
Oranges 10/7/2016 Tony   1
Grapes  10/7/2016 Bob    1
Grapes  10/7/2016 Tom   87
Grapes  10/7/2016 Bob   22
Grapes  10/7/2016 Bob   12
Grapes  10/7/2016 Tony  15

I would like to aggregate this by Name and then by Fruit to get a total number of Fruit per Name. For example:

Bob,Apples,16

I tried grouping by Name and Fruit, but how do I get the total Number for each group?


Solution

  • Use GroupBy.sum:

    df.groupby(['Fruit','Name']).sum()
    
    Out[31]: 
                   Number
    Fruit   Name         
    Apples  Bob        16
            Mike        9
            Steve      10
    Grapes  Bob        35
            Tom        87
            Tony       15
    Oranges Bob        67
            Mike       57
            Tom        15
            Tony        1
    

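    To sum only a specific column, select it before aggregating: df.groupby(['Name', 'Fruit'])['Number'].sum()
    This returns a Series indexed by Name and then Fruit, and it keeps non-numeric columns such as Date out of the result.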
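
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the question's data and produces flat Name, Fruit, Number rows like the Bob,Apples,16 example. The variable names (rows, totals, flat) are mine, not from the question, and the dates are kept as plain strings on the assumption that they are not needed for the totals.

```python
import pandas as pd

# Rebuild the question's data; Date is kept as a plain string (an assumption,
# since the question does not say how it is stored).
rows = [
    ("Apples", "10/6/2016", "Bob", 7),
    ("Apples", "10/6/2016", "Bob", 8),
    ("Apples", "10/6/2016", "Mike", 9),
    ("Apples", "10/7/2016", "Steve", 10),
    ("Apples", "10/7/2016", "Bob", 1),
    ("Oranges", "10/7/2016", "Bob", 2),
    ("Oranges", "10/6/2016", "Tom", 15),
    ("Oranges", "10/6/2016", "Mike", 57),
    ("Oranges", "10/6/2016", "Bob", 65),
    ("Oranges", "10/7/2016", "Tony", 1),
    ("Grapes", "10/7/2016", "Bob", 1),
    ("Grapes", "10/7/2016", "Tom", 87),
    ("Grapes", "10/7/2016", "Bob", 22),
    ("Grapes", "10/7/2016", "Bob", 12),
    ("Grapes", "10/7/2016", "Tony", 15),
]
df = pd.DataFrame(rows, columns=["Fruit", "Date", "Name", "Number"])

# Sum Number per (Name, Fruit); the result is a Series with a two-level index.
totals = df.groupby(["Name", "Fruit"])["Number"].sum()

# Turn the index levels back into ordinary columns: Name, Fruit, Number.
flat = totals.reset_index()

# Look up a single group by its (Name, Fruit) key.
bob_apples = totals.loc[("Bob", "Apples")]
print(flat)
```

If you prefer to skip the reset_index step, passing as_index=False to groupby should give the same flat layout directly.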