I have a dataframe containing the quarterly data, that looks like this:
Country | Q1 | Q2 | Q1_ratio | Q2_ratio | Total |
---|---|---|---|---|---|
USA | 70 | 30 | 0.7 | 0.3 | 100 |
Canada | 60 | 40 | 0.6 | 0.4 | 100 |
Japan | 40 | 60 | 0.4 | 0.6 | 100 |
TOTAL | 170 | 130 |
My issue is this: When I change the value in Total column, I need to recalculate the values for Q1 and Q2, so that that the sum of Q1 and Q2 equals the row total and also the column total of Q1 and Q2 can't change. Also, the Q1 and Q2 ratios should stay as close to the original values as possible to keep the seasonality.
So, in other words, I need to minimize the difference between quarterly ratios before and after the total value update.
Example: Let's say I change the total value for USA to 80, Canada to 90 and Japan to 130. If I keep the quarterly ratios exactly as they were before, I get this:
Country | Q1 | Q2 | Q1_ratio | Q2_ratio | Total |
---|---|---|---|---|---|
USA | 56 | 24 | 0.7 | 0.3 | 80 |
Canada | 54 | 36 | 0.6 | 0.4 | 90 |
Japan | 52 | 78 | 0.4 | 0.6 | 130 |
TOTAL | 162 | 138 |
But this result doesn't comply with the second constraint (the quarterly column totals can't change).
If I keep the original column totals and do some trial and error to minimize the quarterly ratio differences, I get something like this:
Country | Q1 | Q2 | Q1_ratio | Q2_ratio | Total |
---|---|---|---|---|---|
USA | 58 | 22 | 0.725 | 0.275 | 80 |
Canada | 56.5 | 33.5 | 0.628 | 0.372 | 90 |
Japan | 55.5 | 74.5 | 0.427 | 0.573 | 130 |
TOTAL | 170 | 130 |
I feel like this shouldn't be too hard to figure out but the solution eludes me. Any idea how to calculate this/ implement this in Python?
You haven't shared code, and I don't know exactly what your data looks like, so I'll answer accordingly in pseudocode. In fact this isn't really a Pandas question, besides for the simple extraction/replacement of data.
new Total - old Total
), and for each row, for each quarter, multiply the change by that quarter's ratio. This way your ratio is exactly preserved (the middle table in your example).170 - 162
and 130 - 138
). We see that Q1 needs +8 and Q2 needs -8, so we will distribute this according to the following ratioThis yields:
Country Q1 Q2 Q1_ratio Q2_ratio Total
0 USA 59.29 22.15 0.74 0.28 80
1 Canada 56.82 33.54 0.63 0.37 90
2 Japan 53.88 74.31 0.41 0.57 130
Which is about the same as you got by hand.
Note that this solution does not using analytical minimisation techniques, but just applies row and column ratios in order.