I have a dataframe which has flags 0/1 for multiple products along with accounts and which zipcode they belong to. My Goal is to count the 1's in columns which have been created as flags.
Zip acc A B
32123 214124 1 0
32123 124124 0 0
32123 124124 1 1
32123 124124 1 1
12333 112424 1 1
12333 123131 1 0
12333 214135 1 0
12333 123145 1 0
My expected output is in the following format
Zip Pro #acc
32123 A 3
B 2
12333 A 4
B 1
What might be the best way to get to this? I have tried using pd.crosstab/groupby functions but max got to this
g.groupby(['ZIP','A','B']).agg({'ACC':'count'})
c.set_index(['ZIP','A','B'])
Zip A B acc
32123 0 0 1
12333 0 0 2
First, you can .groupby
"Zip" and sum those values to get the number you want:
>>> df = df.groupby("Zip").sum()
acc A B
Zip
12333 572835 4 1
32123 586496 3 2
Then, pd.melt
the data by using "Zip" as id and extracting the values from both "A" and "B" (now the sum from your previous step) to place in your new dataframe:
>>> df = df.reset_index().melt(id_vars=["Zip"], value_vars=["A", "B"], var_name="Pro", value_name="#acc")
Zip Pro #acc
0 12333 A 4
1 32123 A 3
2 12333 B 1
3 32123 B 2
You can also use both "Zip" and "Pro" as index columns if you want:
>>> df = df.set_index(["Zip", "Pro"])
#acc
Zip Pro
12333 A 4
32123 A 3
12333 B 1
32123 B 2