sqlpostgresqlhierarchical-dataltree

PostgreSQL ltree - How to get left/right most path and children from a Binary tree constructed with ltree?


I am using PostgreSQL and ltree for constructing a large set of binary tree data. For a particular logic I have to get the left/right most path of a given node.

Sample of my Binary tree

enter image description here

Sample of my table content

enter image description here

Sample input and expected output:

Input - node 1, left most children

Output - 1, 1.L2, 1.L2.L3,... (Only the outer left most children)

I would like to get this result in postgresql, ltree query.

Please help me to solve this out.

Any better postgre table design can also suggest but the that must get good performance with large amount of data.


Solution

  • A traditional implementation of a tree in SQL is the node_id - parent_id model, which implies the use of recursion in queries (see for example Recursive CTE concatenate fields with parents from arbitrary point). The Ltree extension is an alternative to this approach that allows you to avoid recursive queries in many cases. You seem to be trying to mix both methods in your approach. If you want to use ltree, you basically only need one column to store the whole tree structure:

    create table my_table(
        id int primary key,
        tree ltree,
        person_id int);
    

    According to the normalization rules, you should avoid columns containing data that can be derived from other columns. Note that parent, level and side columns are redundant as tree already contains this information:

    select 
        tree, 
        nlevel(tree) as level, 
        subpath(tree, 0, -1) as parent,
        substring(subpath(tree, -1, 1)::text from '[R|L]') as side
    from my_table
        
        tree    | level | parent  | side
    ------------+-------+---------+------
     1          |     1 |         |
     1.L2       |     2 | 1       | L
     1.R2       |     2 | 1       | R
     1.L2.L3    |     3 | 1.L2    | L
     1.L2.R3    |     3 | 1.L2    | R
     1.L2.L3.L4 |     4 | 1.L2.L3 | L
     1.L2.R3.R4 |     4 | 1.L2.R3 | R
     1.L2.R3.L4 |     4 | 1.L2.R3 | L
    (8 rows)
    

    Back to your main question, a formal solution may use recursion:

    with recursive recursive_tree as (
        select *
        from my_table
        where tree = '1'
    union all
        select t.*
        from my_table t
        join recursive_tree r 
            on subpath(t.tree, 0, -1) = r.tree
            and left(subpath(t.tree, -1, 1)::text, 1) = 'L'
        )
    select *
    from recursive_tree
    

    but you can also interpret ltree values as text in a smart way that leads to a simpler and faster query:

    select *
    from my_table
    where subpath(tree, 0, 1) = '1'
    and tree::text not like '%R%'
    
     id |    tree    | person_id
    ----+------------+-----------
      1 | 1          |         1
      2 | 1.L2       |         2
      4 | 1.L2.L3    |         4
      6 | 1.L2.L3.L4 |         6
    (4 rows)
    

    Db<>fiddle.