I have a multi-index pivot table looks like the following, where Date is the index, and columns are [country, attributes]:
net_supply = pivot_filter.pivot_table(values='Production', index='Date', columns=['country','Attributes'],aggfunc='sum',fill_value=0)
country1 country2
prod runs imps prod runs imps
2018-01-01 10 -1 3 5 -6 3
2018-02-01 11 -11 2 13 -2 4
2018-03-01 8 -7 1 1 -5 2
2018-04-01 4 -7 5 10 -7 5
2018-05-01 8 -10 4 11 -2 5
I would like to add a conditional subtotal sum total for each country that checks:
if prod + runs >= 0, then add prod + runs + imps if true, else only add prod + runs.
This would be for each of the countries by date index. My ideal output would only include that new summed value like below:
country1 country2
subtotal subtotal
2018-01-01 12 -1
2018-02-01 2 15
2018-03-01 2 -4
2018-04-01 -3 8
2018-05-01 -2 14
Note that some countries only have a few of the 3 attributes (prod, or runs or any combination). I know margins can do a straight sum but not with conditions. Any help would be great!
First, get the result of "prod + runs", then get the result of all 3 columns, then finally use np.where
to set the correct result based on the condition.
sum_df = df.drop('imps', axis=1, level=1).groupby(level=0, axis=1).sum()
sum_df[:] = np.where(sum_df >= 0, df.groupby(level=0, axis=1).sum(), sum_df)
sum_df
country1 country2
2018-01-01 12 -1
2018-02-01 2 15
2018-03-01 2 -4
2018-04-01 -3 8
2018-05-01 -2 14