hiverecursive-query

Simulate recursive formula


I'm working on HIVE.

I have the following data:

Date WholeBudget Target PocketA PocketA PocketB
2024-01-01 60 75% 45 15
2024-01-02 80 75% 60 20
2024-01-03 100 50% 60 40
2024-01-04 110 50% 60 50
2024-01-05 120 50% 60 60
2024-01-06 130 50% 65 65
2024-01-07 140 50% 70 70

Date is an input on already existing data.
WholeBudget and Target PocketA are also inputs (actually already computed values using LAG shenanigans).
PoketA is the goal. Its formula is: PocketA(d) = MAX(PocketA(d-1), WholeBudget(d) x Target(d))
where d is the date on the data line.
PocketB is simply the remainder: PocketB(d) = WholeBudget(d) - PocketA(d)

Whole budget is strictly increasing at varying speed.
Pockets are not allowed to decrease (thus the max in PocketA formula).
Target PocketA can decrease. That's why, sometimes, PocketA is above target and waits for PocketB to catch on.

I don't think PocketA is doable with LAG() as it is referencing the previous value as part of the computation.

I'm working on a WHILE loop over all the dates to get it. Although it's an acceptable solution as the number of dates is capped by the current year (max ~200 working days), it still hurts my soul.

Is there a smarter solution working on hive?

To clarify as requested:
Here is the provided data to reproduce:

Date WholeBudget Target PocketA
2024-01-01 60 75%
2024-01-02 80 75%
2024-01-03 100 50%
2024-01-04 110 50%
2024-01-05 120 50%
2024-01-06 130 50%
2024-01-07 140 50%

And here is the expected result:

Date PocketA PocketB
2024-01-01 45 15
2024-01-02 60 20
2024-01-03 60 40
2024-01-04 60 50
2024-01-05 60 60
2024-01-06 65 65
2024-01-07 70 70

Solution

  • I'd calculate it naively (Budget * Target) and then use a window function to look back at ALL previous results (including the current row) and take the max.

    WITH
      calc_pa AS
    (
      SELECT
        date,
        WholeBudget,
        Target,
        MAX(WholeBudget * Target) OVER (ORDER BY date) AS PocketA
      FROM
        yourTable
    )
    SELECT
      *,
      WholeBudget - PocketA   AS PocketB
    FROM
      calc_pa