sqlpostgresqlrecursive-queryrecursive-cte

clean ways to perform calculations with values sourced from nested groups of varying depths?


I would like to use the below tables to calculate z:

item value
a 6
b 8
group source_item weighting
x a 1
x b 0.25
y a 1
group source_group weighting
y x 0.5
z x 1
z y 1

As a human, I'd solve by calculating:

x = 1a + 0.25b = 8

y = 1a + 0.5x = 10

z = 1x + 1y = 18

I'm struggling to find a clean way of doing this in SQL however.

I can get the order of operations fairly easily and (I hope) cleanly, by using a recursive CTE - as follows:

WITH RECURSIVE
    iterations AS (
        --start by getting the groups that aren't dependent on other groups
        SELECT
            g.group
            , 1 AS iteration
        FROM
            groups AS g
            LEFT JOIN groups_groups AS gg
                ON gg.source_group = g.group
        WHERE
            gg.source_group IS NULL
        --traverse dependencies upwards
        UNION ALL
            SELECT
                gg.group
                , i.iteration + 1 AS iteration
            FROM
                iterations AS i
                LEFT JOIN groups_groups AS gg
                    ON gg.source_group = i.group
    )
--filter to max iteration
SELECT DISTINCT
    i.group
    , MAX(i.iteration) OVER (PARTITION BY i.group) AS iteration
FROM
    iterations AS i
ORDER BY
    iteration

This will return the order as follows:

group iteration
x 1
y 2
z 3

From here I planned to use another recursive tree to walk up the iterations and calculate the values. However, since you can only join onto the last recursion, the value of x is inaccessible by the time you reach z. I could probably pull all the values through each recursion and do some join on iteration + 1 to make sure the loop ends appropriately but I started getting the sense that this couldn't be the "correct" way to do things.

Is there a simpler, cleaner, etc. way of solving this?


Solution

  • I think you are overthinking this. You don't need to know the order of operations (unless brackets were involved).

    Just begin with a total of groups_items and items, then recurse over groups_groups, finally summing the total by group.

    WITH RECURSIVE cte AS (
        SELECT
          gi."group",
          SUM(i.value * gi.weighting) AS total
        FROM groups_items gi
        JOIN items i ON i.item = gi.source_item
        GROUP BY
          gi."group"
    
        UNION ALL
    
        SELECT
          gg."group",
          cte.total * gg.weighting AS total
        FROM cte
        JOIN groups_groups gg ON gg.source_group = cte."group"
    )
    SELECT
      cte."group",
      SUM(cte.total) AS total
    FROM cte
    GROUP BY
      cte."group";
    

    db<>fiddle