I have a lot of data from an old system which defines the data in a Bill of Materials by the position it exists in a table. The BoM data table coming from the old system looks like
ID level ItemNumber
1 1 TopItem
2 .2 FirstChildOfTop
3 .2 2ndChildofTop
4 .2 3ChildOfTop
5 ..3 1stChildof3ChildofTop
6 ..3 2ndChildof3ChildofTop
7 .2 4thChildofTop
8 ..3 1stChildof4ChildTop
9 ...4 1stChildof4ChildTop
10 ..3 2ndChildof4ChildofTop
11 .2 5thChildofTop
12 ..3 1stChildof5thChildofTop
13 ...4 1stChildof1stChildof5thChildofTop
14 ..3 2ndChildof5thChildofTop
15 1 2ndTopItem
16 1 3rdTopItem
In my example the ID is consecutive, the real data the ID can be broken but always lowest to highest as that is how the hierarchy is defined.
By using some simple code to replace the level number with tabs we can get visual hierarchy
1 TopItem
2 FirstChildOfTop
3 2ndChildofTop
4 3ChildOfTop
5 1stChildof3ChildofTo
6 2ndChildof3ChildofTo
7 4thChildofTop
8 1stChildof4ChildTop
9 1stChildof4ChildTop
10 2ndChildof4ChildofTo
11 5thChildofTop
12 1stChildof5thChildof
13 1stChildof1stChildof
14 2ndChildof5thChildof
15 2ndTopItem
16 3rdTopItem
As I have about 5,000 of these lists and they are all between 25 and 55 thousand lines long, I need some code to convert this hierarchy to use sql HierarchyID so we can query at any level in the list. At the moment I hope my explanation shows, you have to work from the top to find in the Item is 2nd, 3rd or some other level and if it has any children. The items in the third column exist in a simple Item Master table but its role in a BoM is defined in these tables only.
I'd offer some code but all my attempts and conversion have failed miserably. I'd claim I'm OK a set based queries
The target is Microsoft SQL 2014 The primary aim is to data warehouse the data but enable to people to find sub-assemblies and where used.
Edit: In answer to Anthony Hancock's very pertinent question I did some work. Please consider the following
ID level ItemNumber sampH lft rgt
1 1 TopItem 1/2 2 28
2 .2 FirstChildOfTop 1/2/3 3 4
3 .2 2ndChildofTop 1/2/3 5 6
4 .2 3ChildOfTop 1/2/3 7 11
5 ..3 1stChildof3ChildofTop 2/3/4 8 9
6 ..3 2ndChildof3ChildofTop 2/3/4 10 11
7 .2 4thChildofTop 1/2/3 13 20
8 ..3 1stChildof4ChildTop 2/3/4 14 17
9 ...4 1stChildof4ChildTop 3/4/5 15 16
10 ..3 2ndChildof4ChildofTop 2/3/4 18 19
11 .2 5thChildofTop 1/2/3/ 20 25
12 ..3 1stChildof5thChildofTop 2/3/4 21 24
13 ...4 1stChildof1stChildof5thChildofTop 3/4/5 22 23
14 ..3 2ndChildof5thChildofTop 2/3/4 26 27
15 1 2ndTopItem 1/2 2 28
16 1 3rdTopItem 1/2 2 28
17 0 verytop 1/ 1 29
Apologies for the awful formatting
1) I have added at line 17 the item we are making - ie this BoM makes the 'verytop' item - so I have renumbered the 'level'
2) I have added in the column 'sampH' column my hand edited PathEnumeratedTree values
3) In the two columns 'lft' and 'rgt' I have added some identifiers of NestedSets data
Please forgive if my hand edited columns aren't correct.
My aim is to get a structure so that someone can query these many deep list to find where an item sits in the tree and what are its children. So I'm open to whatever works.
My testing of the NestedSets - so far - has shown I can do stuff like this:
-- Children of a given parent ItemNumber
Select c.itemnumber, ' is child of 2ndTopItem'
from [dbo].[Sample] as p, [dbo].[Sample] as c
where (c.lft between p.lft and p.rgt)
and (c.lft <> p.lft)
and p.ItemNumber = '2ndTopItem'
But I am completely open to any suggestions how to enumerate the tree structure.
Try the following code:
declare @Source table (
Id int ,
[Level] varchar(20) ,
[Name] varchar(50)
);
declare @Target table (
Id int ,
[Level] int ,
[Name] varchar(50) ,
ParentId int ,
Hid hierarchyid ,
primary key (Id),
unique ([Level], Id),
unique (ParentId, Id)
);
-- 1. The Test Data (Thanks Anthony Hancock for it)
insert into @Source
values
( 1 , '1' , 'TopItem' ),
( 2 , '.2' , 'FirstChildOfTop' ),
( 3 , '.2' , '2ndChildofTop' ),
( 4 , '.2' , '3ChildOfTop' ),
( 5 , '..3' , '1stChildof3ChildofTop' ),
( 6 , '..3' , '2ndChildof3ChildofTop' ),
( 7 , '.2' , '4thChildofTop' ),
( 8 , '..3' , '1stChildof4ChildTop' ),
( 9 , '...4' , '1stChildof4ChildTop' ),
( 10 , '..3' , '2ndChildof4ChildofTop' ),
( 11 , '.2' , '5thChildofTop' ),
( 12 , '..3' , '1stChildof5thChildofTop' ),
( 13 , '...4' , '1stChildof1stChildof5thChildofTop' ),
( 14 , '..3' , '2ndChildof5thChildofTop' ),
( 15 , '1' , '2ndTopItem' ),
( 16 , '1' , '3rdTopItem' );
-- 2. Insert the Test Data to the @Target table
-- with converting of the Level column to int data type
-- to use it as an indexed column in the query # 3
-- (once there are millions of records, that index will be highly useful)
insert into @Target (Id, [Level], [Name])
select
Id,
[Level] = cast(replace([Level],'.','') as int),
[Name]
from
@Source
-- 3. Calculate the ParentId column and update the @Target table
-- to use the ParentId as an indexed column in the query # 4
update t set
ParentId = (
select top 1 Id
from @Target as p
where p.Id < t.Id and p.[Level] < t.[Level]
order by p.Id desc )
from
@Target t;
-- 4. Calculate the Hid column
-- based on the ParentId link and in accordance with the Id order
with Recursion as
(
select
Id ,
ParentId ,
Hid = cast(
concat(
'/',
row_number() over (order by Id),
'/'
)
as varchar(1000)
)
from
@Target
where
ParentId is null
union all
select
Id = t.Id ,
ParentId = t.ParentId ,
Hid = cast(
concat(
r.Hid,
row_number() over (partition by t.ParentId order by t.Id),
'/'
)
as varchar(1000)
)
from
Recursion r
inner join @Target t on t.ParentId = r.Id
)
update t set
Hid = r.Hid
from
@Target t
inner join Recursion r on r.Id = t.Id;
-- 5. See the result ordered by Hid
select
Id ,
[Level] ,
[Name] ,
ParentId ,
Hid ,
HidPath = Hid.ToString()
from
@Target
order by
Hid;
Read more about Combination of Id-ParentId and HierarchyId Approaches to Hierarchical Data