postgresqlhierarchyltree

postgres hierarchy - fill in missing values with ancestor's value


I have a postgres table containing a hierarchy (ltree):

| hierarchy |
|-----------|
| A         |
| A.X       |
| A.X.Y.    |
| A.X.Y.Z   |
| A.B       |
| A.B.C     |
| A.B.C.D   |
| A.B.C.D.E |

and a table of results:

| marker | tier | hierarchy |
|:------:|:----:|:---------:|
|    1   |   1  |     A     |
|    1   | NULL |   A.X.Y   |
|    1   |   2  |    A.X    |
|    1   | NULL | A.B.C.D.E |
|    1   |   4  |   A.B.C   |
|    1   | NULL |     A     |
|    2   | NULL |    A.B    |
|    2   | NULL |   A.B.C   |

I want to create a view that will replace NULL values in the tier column with the closest ancestor's tier that is non-NULL (i.e. parent tier - if parent tier is null, look at grandparent's tier etc, all the way back to the root of the tree if necessary) for the relevant marker.

The output view for this example would look like this:

| marker | tier | hierarchy |
|:------:|:----:|:---------:|
|    1   |   1  |     A     |
|    1   |   2  |   A.X.Y   |
|    1   |   2  |    A.X    |
|    1   |   4  | A.B.C.D.E |
|    1   |   4  |    A.B    |
|    1   |   1  |     A     |
|    2   | NULL |    A.B    |
|    2   | NULL |   A.B.C   |

Can you help?


Solution

  • This might look like a job for a self-LEFT JOIN, but the join condition is not so clear cut - it actually needs to select the "closest" from multiple possible ancestors. That's easier expressed with a subquery: demo

    CREATE VIEW nice_results AS
    SELECT
      marker,
      COALESCE(tier, (
        SELECT tier
        FROM results AS ancestor
        WHERE ancestor.marker = node.marker
          AND ancestor.tier IS NOT NULL
          AND ancestor.hierarchy @> node.hierarchy
        ORDER BY nlevel(ancestor.hierarchy) DESC
        LIMIT 1
      )) AS tier,
      hierarchy
    FROM results AS node