sqlpostgresqlrecursive-querytransitive-closure-table

Recursive query challenge - simple parent/child example


Note: with help from RhodiumToad on #postgresql, I've arrived at a solution, which I posted as answer. If anyone can improve on this, please chime in!

I have not been able to adapt a previous recursive query solution to the following directed acyclic graph that includes multiple "root" (ancestor-less) nodes. I'm trying to write a query whose output is what is commonly known as a closure table: a many-to-many table that stores every path from each node to each of its descendants and to itself:

1  2  11  8  4  5  7
 \/    |  |   \ | /
  3    |   \    6
   \   |    \  /
    9  |     10
     \/     /
     12    13
       \  /
        14

CREATE TABLE node (
  id        SERIAL PRIMARY KEY,
  node_name VARCHAR(50) NOT NULL
);

CREATE TABLE node_relations (
  id                 SERIAL PRIMARY KEY,
  ancestor_node_id   INT REFERENCES node(id),
  descendant_node_id INT REFERENCES node(id)
);

INSERT into node (node_name)
SELECT 'node ' || g FROM generate_series(1,14) g;

INSERT INTO node_relations(ancestor_node_id, descendant_node_id) VALUES
(1,3),(2,3),(4,6),(5,6),(7,6),(3,9),(6,10),(8,10),(9,12),(11,12),(10,13),(12,14),(13,14);

It's been hard to pinpoint the issue(s) - am I'm missing node_relation rows? Is the query wrong?

WITH RECURSIVE node_graph AS (
   SELECT ancestor_node_id, ARRAY[descendant_node_id] AS path, 0 AS level
   FROM   node_relations

   UNION  ALL
   SELECT nr.ancestor_node_id,  ng.path || nr.descendant_node_id,ng.level + 1 AS level
   FROM   node_graph ng
   JOIN   node_relations nr ON nr.descendant_node_id = ng.ancestor_node_id 
)
SELECT path[array_upper(path,1)] AS ancestor,
       path[1] AS descendant,
       path, 
       level as depth
FROM   node_graph
ORDER  BY level, ancestor;

Expected Output:

ancestor | descendant | path
---------+------------+------------------
1        | 3          | "{1,3}"
1        | 9          | "{1,3,9}"
1        | 12         | "{1,3,9,12}"
1        | 14         | "{1,3,9,12,14}"
2        | 3          | "{2,3}"
2        | 9          | "{2,3,9}"
2        | 12         | "{2,3,9,12}"
2        | 14         | "{2,3,9,12,14}"
3        | 9          | "{3,9}"
3        | 12         | "{3,9,12}"
3        | 14         | "{3,9,12,14}"
4        | 6          | "{4,6}"
4        | 10         | "{4,6,10}"
4        | 13         | "{4,6,10,13}"
4        | 14         | "{4,6,10,13,14}"
5        | 6          | "{5,6}"
5        | 10         | "{5,6,10}"
5        | 13         | "{5,6,10,13}"
5        | 14         | "{5,6,10,13,14}"
6        | 10         | "{6,10}"
6        | 13         | "{6,10,13}"
6        | 14         | "{6,10,13,14}"
7        | 6          | "{7,6}"
7        | 10         | "{7,6,10}"
7        | 13         | "{7,6,10,13}"
7        | 14         | "{7,6,10,13,14}"
8        | 10         | "{8,10}"
8        | 13         | "{8,10,13}"
8        | 14         | "{8,10,13,14}"
9        | 12         | "{9,12}"
9        | 14         | "{9,12,14}"
10       | 13         | "{10,13}"
10       | 14         | "{10,13,14}"
11       | 12         | "{11,12}"
11       | 14         | "{11,12,14}"
12       | 14         | "{12,14}"
13       | 14         | "{13,14}"

Solution

  • With help from RhodiumToad on #postgresql, I've arrived at this solution:

    WITH RECURSIVE node_graph AS (
        SELECT ancestor_node_id as path_start, descendant_node_id as path_end,
               array[ancestor_node_id, descendant_node_id] as path 
        FROM node_relations
    
        UNION ALL 
    
        SELECT ng.path_start, nr.descendant_node_id as path_end,
               ng.path || nr.descendant_node_id as path
        FROM node_graph ng
        JOIN node_relations nr ON ng.path_end = nr.ancestor_node_id
    ) 
    SELECT * from node_graph order by path_start, array_length(path,1);
    

    The result is exactly as expected.