pandasgroup-bypivotpivot-tablebitflags

Groupby and Count Flags as indexes in Pandas


I have a dataframe which has flags 0/1 for multiple products along with accounts and which zipcode they belong to. My Goal is to count the 1's in columns which have been created as flags.

    Zip     acc     A   B
    32123   214124  1   0
    32123   124124  0   0
    32123   124124  1   1
    32123   124124  1   1
    12333   112424  1   1
    12333   123131  1   0
    12333   214135  1   0
    12333   123145  1   0

My expected output is in the following format

Zip     Pro #acc
32123   A   3
        B   2
12333   A   4
        B   1

What might be the best way to get to this? I have tried using pd.crosstab/groupby functions but max got to this

g.groupby(['ZIP','A','B']).agg({'ACC':'count'})
c.set_index(['ZIP','A','B'])

Zip     A   B   acc
32123   0   0   1
12333   0   0   2

Solution

  • First, you can .groupby "Zip" and sum those values to get the number you want:

    >>> df = df.groupby("Zip").sum()
              acc  A  B
    Zip                
    12333  572835  4  1
    32123  586496  3  2
    

    Then, pd.melt the data by using "Zip" as id and extracting the values from both "A" and "B" (now the sum from your previous step) to place in your new dataframe:

    >>> df = df.reset_index().melt(id_vars=["Zip"], value_vars=["A", "B"], var_name="Pro", value_name="#acc")
         Zip Pro  #acc
    0  12333   A     4
    1  32123   A     3
    2  12333   B     1
    3  32123   B     2
    

    You can also use both "Zip" and "Pro" as index columns if you want:

    >>> df = df.set_index(["Zip", "Pro"])
               #acc
    Zip   Pro      
    12333 A       4
    32123 A       3
    12333 B       1
    32123 B       2