I have the following two SQL tables:
CREATE TABLE tag_hierarchies (
ancestor_id integer NOT NULL,
descendant_id integer NOT NULL,
generations integer NOT NULL
);
Here a generations value of 0 is a tag's self-referencing row; a root tag has no other rows in which it appears as a descendant.
CREATE TABLE tags (
id BIGSERIAL PRIMARY KEY,
name VARCHAR
);
I'm unable to write a query that returns the lowest generation (i.e. the highest ancestor) per tree. For example, the trees a/b/c/d and e would be represented in the database as:
tag_hierarchies
ancestor_id | descendant_id | generations |
---|---|---|
84 | 84 | 0 |
85 | 85 | 0 |
84 | 85 | 1 |
86 | 86 | 0 |
85 | 86 | 1 |
84 | 86 | 2 |
87 | 87 | 0 |
86 | 87 | 1 |
85 | 87 | 2 |
84 | 87 | 3 |
88 | 88 | 0 |
and
tags
id | name |
---|---|
84 | a |
85 | b |
86 | c |
87 | d |
88 | e |
I want to run a query that returns the unique lowest generation (i.e. the highest ancestor) per tree. The query takes a base scope of tags and returns unique tags. For example, if I run it with b, c and e, it should return b and e: b has a lower generation than c, and e is already a root.
Assuming that
"...lowest generation (ie. highest ancestor) per tree ..."
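means how deep in the hierarchy an id appears as a descendant, you could create a CTE (grid) to get the lowest generation per tree. Note that root_ancestor is taken as Min(h.ancestor_id), which assumes the root has the smallest id in its tree, as it does in the sample data: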
Updated code (after clarifications in the comments):
WITH
grid AS
( Select t.id, t.name,
Max(h.generations) as lowest_generation,
Case When Max(h.generations) > 0 Then Min(h.ancestor_id) Else t.id End as root_ancestor
From tags t
Inner Join tag_hierarchies h ON( h.descendant_id = t.id )
Group By t.id, t.name
)
-- Checking grid resultset
Select g.*
From grid g
Order By g.lowest_generation, g.id
/* R e s u l t :
id name lowest_generation root_ancestor
-- ------ ----------------- -------------
84 a 0 84
88 e 0 88
85 b 1 84
86 c 2 84
87 d 3 84 */
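If ids are not guaranteed to grow from root to leaf, the Min(h.ancestor_id) shortcut can pick the wrong root. A minimal sketch of an alternative, assuming SQLite driven from Python's sqlite3 module purely so it can be run locally: it takes the ancestor from the row with the greatest generations value for each tag. The extra x/y tree (ids 90 and 89) is hypothetical data added only to show a root with a larger id than its child; it is not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tag_hierarchies (
    ancestor_id   INTEGER NOT NULL,
    descendant_id INTEGER NOT NULL,
    generations   INTEGER NOT NULL
);
-- Sample data from the question (trees a/b/c/d and e)
INSERT INTO tags VALUES (84,'a'),(85,'b'),(86,'c'),(87,'d'),(88,'e');
INSERT INTO tag_hierarchies VALUES
    (84,84,0),(85,85,0),(84,85,1),(86,86,0),(85,86,1),(84,86,2),
    (87,87,0),(86,87,1),(85,87,2),(84,87,3),(88,88,0);
-- Hypothetical extra tree x/y where the root (90) has the larger id
INSERT INTO tags VALUES (90,'x'),(89,'y');
INSERT INTO tag_hierarchies VALUES (90,90,0),(89,89,0),(90,89,1);
""")

# For each tag, keep the closure row with the greatest distance:
# its ancestor_id is the root of the tag's tree, whatever the id order.
ROOT_SQL = """
SELECT t.id, t.name,
       h.generations AS lowest_generation,
       h.ancestor_id AS root_ancestor
FROM tags t
INNER JOIN tag_hierarchies h ON h.descendant_id = t.id
WHERE h.generations = (SELECT MAX(h2.generations)
                       FROM tag_hierarchies h2
                       WHERE h2.descendant_id = t.id)
ORDER BY h.generations, t.id
"""

rows = conn.execute(ROOT_SQL).fetchall()
roots = {name: root for (_id, name, _gen, root) in rows}
for row in rows:
    print(row)
```

For the question's own data this gives the same grid as above; the difference only shows for y, whose root is reported as 90 rather than 89. This assumes each tag has exactly one row at its maximum distance, which holds for a tree-shaped closure table.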
... now if you want to do it on specific tags as mentioned in the question ...
As an example if I run it with b, c & e it would return b & e. Since b has a lower generation between b & c and e is already a root.
... then put your tags (b, c, e) as a filter in the Where clause of the grid cte, and build the Where clause of the main SQL to keep, for each root_ancestor, only the tag(s) with the smallest lowest_generation (a root tag has lowest_generation 0, so it is always kept) ...
WITH
grid AS
( Select t.id, t.name,
Max(h.generations) as lowest_generation,
Case When Max(h.generations) > 0 Then Min(h.ancestor_id) Else t.id End as root_ancestor
From tags t
Inner Join tag_hierarchies h ON( h.descendant_id = t.id )
Where t.name IN('b', 'c', 'e')
Group By t.id, t.name
)
-- M a i n S Q L :
Select g.id, g.name
From grid g
Where g.lowest_generation IN(Select Min(lowest_generation)
From grid
Where root_ancestor = g.root_ancestor
Group By root_ancestor)
Order By g.id
-- R e s u l t :
-- (for grid Where clause) Where t.name IN('b', 'c', 'e')
id name
-- ----
85 b
88 e
-- R e s u l t :
-- for grid with no Where clause
id name
-- ----
84 a
88 e
See the fiddle here.
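To try the query locally, here is a minimal sketch, assuming SQLite through Python's sqlite3 module. The function name top_tags and its names parameter are hypothetical, introduced only for this illustration; the SQL is the query from above with the tag filter turned into bound parameters.

```python
import sqlite3

SETUP = """
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tag_hierarchies (
    ancestor_id   INTEGER NOT NULL,
    descendant_id INTEGER NOT NULL,
    generations   INTEGER NOT NULL
);
INSERT INTO tags VALUES (84,'a'),(85,'b'),(86,'c'),(87,'d'),(88,'e');
INSERT INTO tag_hierarchies VALUES
    (84,84,0),(85,85,0),(84,85,1),(86,86,0),(85,86,1),(84,86,2),
    (87,87,0),(86,87,1),(85,87,2),(84,87,3),(88,88,0);
"""

def top_tags(conn, names=None):
    """Return (id, name) of the highest selected tag in each tree.

    names: optional list of tag names forming the base scope;
    when omitted, every tag is considered.
    """
    where, params = "", []
    if names:
        placeholders = ", ".join("?" for _ in names)
        where = f"WHERE t.name IN ({placeholders})"
        params = list(names)
    sql = f"""
    WITH grid AS (
        SELECT t.id, t.name,
               MAX(h.generations) AS lowest_generation,
               CASE WHEN MAX(h.generations) > 0
                    THEN MIN(h.ancestor_id) ELSE t.id END AS root_ancestor
        FROM tags t
        INNER JOIN tag_hierarchies h ON h.descendant_id = t.id
        {where}
        GROUP BY t.id, t.name
    )
    SELECT g.id, g.name
    FROM grid g
    WHERE g.lowest_generation IN (SELECT MIN(lowest_generation)
                                  FROM grid
                                  WHERE root_ancestor = g.root_ancestor
                                  GROUP BY root_ancestor)
    ORDER BY g.id
    """
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)
print(top_tags(conn, ["b", "c", "e"]))  # [(85, 'b'), (88, 'e')]
print(top_tags(conn))                   # [(84, 'a'), (88, 'e')]
```

Only the placeholders are interpolated into the SQL text; the tag names themselves are passed as bound parameters.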
NOTE:
The code is the same for both MySQL and PostgreSQL, as well as for almost every other dialect (Oracle, SQL Server, SQLite, MariaDB, ...). MySQL needs version 8.0 or later for CTE (WITH) support.