I have a recursive hierarchy in a relational database that reflects teams and their position within the hierarchy.
I want to flatten this hierarchy into a dimension for data warehousing. It's a SQL Server database, and I'm using SSIS to load SSAS.
I have a table, teams:
teamid  teamname
1       Team 1
2       Team 2
And a table teamhierarchymapping:
teamid  hierarchyid
1       4
2       2
And a table hierarchy:
sequenceid  parentsequenceid  Name
1           null              root
2           1                 Level 1.1
3           1                 Level 1.2
4           3                 Level 1.2 1
Giving:
root
    Level 1.1 (contains Team 2)
    Level 1.2
        Level 1.2 1 (contains Team 1)
I want to flatten this to a dimension like:
Team Name  Level 1  Level 2    Level 3
Team 1     Root     Level 1.1  [None]
Team 2     Root     Level 1.2  Level 1.2 1
I've tried various nasty sets of SQL, and various piping around in SSIS (which I am just starting to pick up), but I haven't found a solution that brings it all together.
Can anyone help?
(Edit: corrected an issue with the sample data, I think.)
Do you have an error in your sample data? I can't see how the hierarchy mapping connects to the hierarchy table to get the results you want, unless the hierarchy mapping is teamid 1 => hierid 2 and teamid 2 => hierid 4.
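SSIS may not be able to do it (easily), so it may be better to create an OLE DB Source that executes SQL of the following format. Note this assumes SQL Server 2005 or later, since recursive CTEs and the PIVOT operator were both introduced in 2005. The table and column names are my own (TeamHierarchy with sequenceid, parentseqid and hiername), so adjust them to match your schema.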
WITH hier AS (
    -- Anchor: every node paired with itself
    SELECT parentseqid, sequenceid, hiername AS parentname, hiername
    FROM TeamHierarchy
    UNION ALL
    -- Recursive step: carry the ancestor's columns down to each descendant
    SELECT hier.parentseqid, TH.sequenceid, hier.parentname, TH.hiername
    FROM hier
    INNER JOIN TeamHierarchy TH ON TH.parentseqid = hier.sequenceid
),
teamhier AS (
    -- Helper: each team with the hierarchy node it is mapped to
    SELECT T.*, THM.hierarchyid
    FROM Teams T
    INNER JOIN TeamHierarchyMapping THM ON T.teamid = THM.teamid
)
SELECT *
FROM (
    -- One row per ancestor of each team's node, numbered from the root down.
    -- NULL sorts first, so the root gets 1; this assumes ids increase with depth.
    SELECT ROW_NUMBER() OVER (PARTITION BY teamname ORDER BY parentseqid) AS Depth,
           hier.parentname, teamhier.teamname
    FROM hier
    INNER JOIN teamhier ON hier.sequenceid = teamhier.hierarchyid
) AS t1
PIVOT (MAX(parentname) FOR Depth IN ([1],[2],[3],[4],[5],[6],[7],[8],[9])) AS pvtTable
ORDER BY teamname;
There are a few different elements to this, and there may be a better way to do it, but for flattening hierarchies, CTEs are ideal.
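If you want to try the query before pointing it at real data, something like the following throwaway setup should do. It loads the sample rows from the question into tables named the way the query expects. The data types are my guesses, since the question doesn't give them.

```sql
-- Throwaway tables using the names the query above expects.
-- Data types are assumptions; the question does not state them.
CREATE TABLE Teams (
    teamid   INT PRIMARY KEY,
    teamname VARCHAR(50) NOT NULL
);

CREATE TABLE TeamHierarchy (
    sequenceid  INT PRIMARY KEY,
    parentseqid INT NULL,              -- NULL marks the root node
    hiername    VARCHAR(50) NOT NULL
);

CREATE TABLE TeamHierarchyMapping (
    teamid      INT NOT NULL,
    hierarchyid INT NOT NULL           -- points at TeamHierarchy.sequenceid
);

-- Sample rows copied from the question.
INSERT INTO Teams (teamid, teamname) VALUES (1, 'Team 1');
INSERT INTO Teams (teamid, teamname) VALUES (2, 'Team 2');

INSERT INTO TeamHierarchy (sequenceid, parentseqid, hiername) VALUES (1, NULL, 'root');
INSERT INTO TeamHierarchy (sequenceid, parentseqid, hiername) VALUES (2, 1, 'Level 1.1');
INSERT INTO TeamHierarchy (sequenceid, parentseqid, hiername) VALUES (3, 1, 'Level 1.2');
INSERT INTO TeamHierarchy (sequenceid, parentseqid, hiername) VALUES (4, 3, 'Level 1.2 1');

INSERT INTO TeamHierarchyMapping (teamid, hierarchyid) VALUES (1, 4);
INSERT INTO TeamHierarchyMapping (teamid, hierarchyid) VALUES (2, 2);
```

Single-row INSERT statements are used so the script does not depend on the multi-row VALUES syntax that only arrived in SQL Server 2008.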
Two CTEs are created: 'hier', which takes care of flattening the hierarchy, and 'teamhier', which is just a helper "view" to make the later joins simpler. If you take just the hier CTE and run it, you'll get your flattened view, with one row for every ancestor/descendant pair (each node also appears paired with itself):
WITH hier AS (
    SELECT parentseqid, sequenceid, hiername AS parentname, hiername
    FROM TeamHierarchy
    UNION ALL
    SELECT hier.parentseqid, TH.sequenceid, hier.parentname, TH.hiername
    FROM hier
    INNER JOIN TeamHierarchy TH ON TH.parentseqid = hier.sequenceid
)
SELECT * FROM hier ORDER BY parentseqid, sequenceid;
The next part takes this flattened view, joins it to your team tables (to get the team name) and uses SQL Server's PIVOT operator to rotate the rows into columns, so everything lines up as you want it. More information on PIVOT is available on MSDN.
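To see the PIVOT step on its own, here is a small sketch that needs no tables at all. The inline rows are hand-written stand-ins for what the inner query might produce (team, depth, ancestor name); the Level1 to Level3 aliases and the ISNULL default of '[None]' are my additions to match the layout asked for in the question.

```sql
-- Stand-alone PIVOT illustration: the derived table 'src' is hand-written
-- sample input, not output captured from the real query.
SELECT teamname,
       [1] AS Level1,
       [2] AS Level2,
       ISNULL([3], '[None]') AS Level3   -- fill the gap for shallower teams
FROM (
    SELECT 'Team 1' AS teamname, 1 AS Depth, 'root' AS parentname
    UNION ALL SELECT 'Team 1', 2, 'Level 1.2'
    UNION ALL SELECT 'Team 1', 3, 'Level 1.2 1'
    UNION ALL SELECT 'Team 2', 1, 'root'
    UNION ALL SELECT 'Team 2', 2, 'Level 1.1'
) AS src
-- Each distinct Depth value listed here becomes a column; MAX just picks
-- the single name that exists for a given team and depth.
PIVOT (MAX(parentname) FOR Depth IN ([1], [2], [3])) AS pvt
ORDER BY teamname;
```

PIVOT groups by every column of the source that is not named in the aggregate or the FOR clause, which is why the inner query should expose only the team name, the depth and the value being spread.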
If PIVOT isn't available to you (for example, the database is running at a compatibility level below 90), you can take just the hierarchy-flattening bit and hopefully let SSIS's native Pivot transformation do the dirty pivoting work.
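If you go the SSIS route, the data flow will want the levels in long form: one row per team per level, with a depth number to pivot on. The following is a rough sketch of such a feed, assuming the same Teams, TeamHierarchy and TeamHierarchyMapping names as the query above. The 'walk' CTE and its 'stepsup' column are names I made up for illustration. It climbs from each team's node up to the root and numbers levels by distance from the root, so it does not rely on ids increasing with depth.

```sql
WITH walk AS (
    -- Anchor: the node each team is mapped to, zero steps up
    SELECT THM.teamid, TH.sequenceid, TH.parentseqid, TH.hiername, 0 AS stepsup
    FROM TeamHierarchyMapping THM
    INNER JOIN TeamHierarchy TH ON TH.sequenceid = THM.hierarchyid
    UNION ALL
    -- Recursive step: move to the parent of the current node
    SELECT w.teamid, P.sequenceid, P.parentseqid, P.hiername, w.stepsup + 1
    FROM walk w
    INNER JOIN TeamHierarchy P ON P.sequenceid = w.parentseqid
)
SELECT T.teamname,
       -- The root is the furthest step up, so it is numbered 1
       ROW_NUMBER() OVER (PARTITION BY w.teamid ORDER BY w.stepsup DESC) AS Depth,
       w.hiername
FROM walk w
INNER JOIN Teams T ON T.teamid = w.teamid
ORDER BY T.teamname, Depth;
```

The recursion stops by itself at the root, because a NULL parentseqid never matches a sequenceid in the join.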