sqlmysqldata-warehousehierarchical-data

Extrapolating self-referencing keys from a hierarchy column


I'm working with a dataset in MySQL that includes a hierarchy column with formats like:

1
1.A
1.A.1
1.A.1.a
1.A.1.b
1.A.2
1.A.2.a
1.B.1
1.B.1.a
2
2.A
...

I'd like to build a table with a self-referencing key to the parent (the last value before the most recent dot) by parsing this out on the fly in a SELECT. Is there a way to do this in MySQL or will I need to use an external script?


Solution

  • WITH cte AS (
      SELECT t1.id, MAX(t2.path) parent_path
      FROM test t1
      JOIN test t2 ON t1.path LIKE CONCAT(t2.path, '.%')
      GROUP BY t1.id
    )
    UPDATE test t1
    JOIN cte ON t1.id = cte.id
    JOIN test t2 ON cte.parent_path = t2.path
    SET t1.parent_id = t2.id,
        t1.node_name = TRIM(LEADING '.' FROM SUBSTRING(t1.path FROM 1 + LENGTH(cte.parent_path)));
    SELECT * FROM test;
    
    id path parent_id node_name
    1 1 null null
    2 1.A 1 A
    3 1.A.1 2 1
    4 1.A.1.a 3 a
    5 1.A.1.b 3 b
    6 1.A.2 2 2
    7 1.A.2.a 6 a
    8 1.B.1 1 B.1
    9 1.B.1.a 8 a
    10 2 null null
    11 2.A 10 A

    fiddle

    PS. The node name in the source table must be defined as UNIQUE (and maybe NOT NULL).