sqlsql-servert-sqlhierarchyhierarchyid

Flatten parent child hierarchy with multiple parents


I have a parent-child hierarchy in my source structure where a child could point to his parent and for his parent could exist multiple rows. If we than flatten the hierarchy this would mean that every child row need to exist beneath his parent row.

Image below to clarify

issue

I have already picked out my brains how to resolve this in a performant manner in T-SQL because my source set is 300K rows and this will lead to heavy processing. Help is greatly appreciated!

Code to directly start from the example above see below.

SELECT *
INTO Orders
FROM (SELECT CAST('A' AS NVARCHAR(1)) AS Id, CAST('' AS NVARCHAR(1)) AS ParentId, CAST(10 AS int) AS Qty UNION ALL
SELECT CAST('A' AS NVARCHAR(1)) AS Id, CAST('' AS NVARCHAR(1)) AS ParentId, CAST(20 AS int) AS Qty UNION ALL
SELECT CAST('B' AS NVARCHAR(1)) AS Id, CAST('A' AS NVARCHAR(1)) AS ParentId, CAST(30 AS int) AS Qty UNION ALL
SELECT CAST('B' AS NVARCHAR(1)) AS Id, CAST('A' AS NVARCHAR(1)) AS ParentId, CAST(40 AS int) AS Qty UNION ALL
SELECT CAST('C' AS NVARCHAR(1)) AS Id, CAST('B' AS NVARCHAR(1)) AS ParentId, CAST(50 AS int) AS Qty UNION ALL
SELECT CAST('C' AS NVARCHAR(1)) AS Id, CAST('B' AS NVARCHAR(1)) AS ParentId, CAST(60 AS int) AS Qty UNION ALL
SELECT CAST('C' AS NVARCHAR(1)) AS Id, CAST('B' AS NVARCHAR(1)) AS ParentId, CAST(70 AS int) AS Qty) AS T1

Solution

  • I have no idea why you would want to do this, but this is how I would do it...

    SELECT
      L1.ID,
      L2.ID,
      L3.ID,
      COALESCE(L3.Qty, L2.Qty, L1.Qty)
    FROM
      yourTable   AS L1
    OUTER APPLY
    (
      SELECT * FROM yourTable WHERE parentID = L1.ID
      UNION ALL
      SELECT NULL, NULL, NULL
    )
      AS L2
    OUTER APPLY
    (
      SELECT * FROM yourTable WHERE parentID = L2.ID
      UNION ALL
      SELECT NULL, NULL, NULL
    )
      AS L3
    WHERE
      L1.ParentID IS NULL
    

    You're going to get a LOT of rows though. Just in your example 8 input rows became 18 output rows. 300k input rows could easily become many millions of output rows, depending on the level of duplication in the data.