sqltype-2-dimension

Selecting from a slowly changing dimension type II


I am having trouble with a slowly changing dimension type II selection. I am looking to select the actual length of an employees skill certification so that i can display start and end of his certification in a report. How would you go about doing that? Below i have included an image of the records for an employee '80844' and below the records the expected result.

I am using Microsoft SQL Server 2008 R2.

Records an employee 80844 and the expected result needed to display his certification


Solution

  • Thank you for interesting question. It is well-known 'gaps and islands' problem. You can read more about it there.

    For your case solution could look like this:

        create table #tmp 
           (
             dimEmployeeID int not null,
             EmployeeNumber int not null,
             Start datetime not null,
             [End] datetime not null,
             SkillID int not null
           )
    
    
       insert into #tmp values
         (386 , 80844, '1900-01-01', '2012-02-28', 14),
         (1172, 80844, '2012-02-29', '2012-02-29', 14),
         (1173, 80844, '2012-03-01', '2012-04-01', 14),
         (1175, 80844, '2012-04-06', '2012-04-12', 14),
         (1176, 80844, '2012-04-13', '2012-04-21', 14),
         (1172, 80844, '2012-02-29', '2012-02-29', 87),
         (1173, 80844, '2012-03-01', '2012-04-01', 87),
         (1174, 80844, '2012-04-02', '2012-04-05', 87),
         (1175, 80844, '2012-04-06', '2012-04-12', 87)
    
       ;with StartingPoints as
       (
         SELECT EmployeeNumber, SkillID, Start, ROW_NUMBER() OVER(partition by EmployeeNumber,    SkillID order by Start asc) AS rn
         FROM #tmp AS A
         WHERE NOT EXISTS
           (
             SELECT 1
             FROM #tmp AS B
             WHERE 
               A.EmployeeNumber = B.EmployeeNumber
               and A.SkillID = B.SkillID
               and A.Start - 1 = B.[End]
            )
       ),
       EndingPoints AS
       (
         SELECT EmployeeNumber, SkillID, [End], ROW_NUMBER() OVER(partition by EmployeeNumber,    SkillID order by Start asc) AS rn
         FROM #tmp AS A
         WHERE NOT EXISTS
           (
             SELECT 1
             FROM #tmp AS B
             WHERE 
               A.EmployeeNumber = B.EmployeeNumber
               and A.SkillID = B.SkillID
               and A.[End] + 1 = B.Start
            )
       )
       SELECT S.EmployeeNumber, S.SkillID, S.Start, E.[End]
       FROM StartingPoints AS S
         JOIN EndingPoints AS E
       ON 
         S.EmployeeNumber = E.EmployeeNumber
         and S.SkillID = E.SkillID
         and S.rn = E.rn