I'm working with a dataset in MySQL that includes a hierarchy column with formats like:
1
1.A
1.A.1
1.A.1.a
1.A.1.b
1.A.2
1.A.2.a
1.B.1
1.B.1.a
2
2.A
...
I'd like to build a table with a self-referencing key to the parent (the last value before the most recent dot) by parsing this out on the fly in a SELECT. Is there a way to do this in MySQL or will I need to use an external script?
WITH cte AS (
SELECT t1.id, MAX(t2.path) parent_path
FROM test t1
JOIN test t2 ON t1.path LIKE CONCAT(t2.path, '.%')
GROUP BY t1.id
)
UPDATE test t1
JOIN cte ON t1.id = cte.id
JOIN test t2 ON cte.parent_path = t2.path
SET t1.parent_id = t2.id,
t1.node_name = TRIM(LEADING '.' FROM SUBSTRING(t1.path FROM 1 + LENGTH(cte.parent_path)));
SELECT * FROM test;
id | path | parent_id | node_name |
---|---|---|---|
1 | 1 | null | null |
2 | 1.A | 1 | A |
3 | 1.A.1 | 2 | 1 |
4 | 1.A.1.a | 3 | a |
5 | 1.A.1.b | 3 | b |
6 | 1.A.2 | 2 | 2 |
7 | 1.A.2.a | 6 | a |
8 | 1.B.1 | 1 | B.1 |
9 | 1.B.1.a | 8 | a |
10 | 2 | null | null |
11 | 2.A | 10 | A |
PS. The node name in the source table must be defined as UNIQUE (and maybe NOT NULL).