sql sql-server temporal-tables

Query History of specific columns from a system-versioned table


I have a SQL Server system-versioned table with ~20 columns, all of which hold changing values over time. I am trying to get the changed values for only a subset of the columns along with their respective validity columns.

Some example data:

Name       Company    Location       SysStartTime                 SysEndTime
Employee1  Company A  New York       2023-11-23 05:28:46.9571214  2023-12-07 05:20:40.7315348
Employee1  Company A  San Francisco  2023-12-07 05:20:40.7315348  2024-01-26 05:13:37.1539216
Employee1  Company B  Berlin         2024-01-26 05:13:37.1539216  2024-01-27 05:13:28.0830253
Employee1  Company A  Tokyo          2024-01-27 05:13:28.0830253  2024-03-09 05:12:29.7629149
Employee1  Company A  Rome           2024-03-09 05:12:29.7629149  2024-04-13 04:10:13.4617646
Employee1  Company A  Kinshasa       2024-04-13 04:10:13.4617646  9999-12-31 23:59:59.9999999
Employee2  Company A  Newtown        2023-11-23 05:28:46.9571214  2024-01-26 05:13:37.1539216
Employee2  Company A  Oldtown        2024-01-26 05:13:37.1539216  2024-04-13 04:10:13.4617646
Employee2  Company C  Towntown       2024-04-13 04:10:13.4617646  9999-12-31 23:59:59.9999999

And the desired output:

Name       Company    SysStartTime                 SysEndTime
Employee1  Company A  2023-11-23 05:28:46.9571214  2024-01-26 05:13:37.1539216
Employee1  Company B  2024-01-26 05:13:37.1539216  2024-01-27 05:13:28.0830253
Employee1  Company A  2024-01-27 05:13:28.0830253  9999-12-31 23:59:59.9999999
Employee2  Company A  2023-11-23 05:28:46.9571214  2024-04-13 04:10:13.4617646
Employee2  Company C  2024-04-13 04:10:13.4617646  9999-12-31 23:59:59.9999999

Basically, I want the validity periods for the Name and Company columns while ignoring changes in Location. Consecutive rows that differ only in Location should be merged into one period, running from the SysStartTime of the first row to the SysEndTime of the last.

I may later need to include additional columns in the same way, still ignoring changes to the columns I am not interested in.

What query would give me the desired result?


Solution

  • Use a self-join in the first CTE to fetch the previous Company for each row of the same Name. The second CTE groups consecutive records that share the same Company, and the final query consolidates each group, taking the earliest start and latest end time for each continuous employment period per company:

    -- Step 1: pair each row with the row whose period ends exactly where this one starts
    WITH Previous AS (
        SELECT 
            curr.Name,
            curr.Company,
            curr.SysStartTime,
            curr.SysEndTime,
            prev.Company AS PrevCompany
        FROM EmployeeHistory curr
        LEFT JOIN EmployeeHistory prev
            ON curr.Name = prev.Name 
            AND curr.SysStartTime = prev.SysEndTime
    )
    -- Step 2: a running count of Company changes gives each consecutive run its own GroupID
    , Grouped AS (
        SELECT *,
            SUM(CASE WHEN Company <> PrevCompany OR PrevCompany IS NULL THEN 1 ELSE 0 END)
            OVER (PARTITION BY Name ORDER BY SysStartTime ROWS UNBOUNDED PRECEDING) AS GroupID
        FROM Previous
    )
    -- Step 3: collapse each run into a single row spanning its full validity period
    SELECT 
        Name,
        Company,
        MIN(SysStartTime) AS SysStartTime,
        MAX(SysEndTime) AS SysEndTime
    FROM Grouped
    GROUP BY Name, Company, GroupID
    ORDER BY Name, SysStartTime;
    

    DBFiddle
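
As a variation on the query above, the self-join can be swapped for LAG(), which reads the previous row's Company directly. The sketch below is not the approach from the fiddle, and it makes two assumptions: the system-versioned table is called dbo.Employee (a hypothetical name), and it is read with FOR SYSTEM_TIME ALL so that both current and historical rows are returned. Unlike the self-join, LAG() does not check that the previous period ends exactly where the current one starts, so a gap in the history (for example a delete followed by a re-insert) would still be merged into one period.

```sql
-- Sketch only: dbo.Employee is an assumed table name, not taken from the question.
WITH Flagged AS (
    SELECT
        Name,
        Company,
        SysStartTime,
        SysEndTime,
        -- 1 marks the start of a new run; the first row per Name has no
        -- previous value, so the comparison is NULL and falls through to ELSE.
        CASE
            WHEN LAG(Company) OVER (PARTITION BY Name ORDER BY SysStartTime) = Company
            THEN 0
            ELSE 1
        END AS IsNewGroup
    FROM dbo.Employee FOR SYSTEM_TIME ALL
)
, Grouped AS (
    SELECT *,
        -- Running total of run starts: rows in the same run share a GroupID.
        SUM(IsNewGroup)
        OVER (PARTITION BY Name ORDER BY SysStartTime ROWS UNBOUNDED PRECEDING) AS GroupID
    FROM Flagged
)
SELECT
    Name,
    Company,
    MIN(SysStartTime) AS SysStartTime,
    MAX(SysEndTime) AS SysEndTime
FROM Grouped
GROUP BY Name, Company, GroupID
ORDER BY Name, SysStartTime;
```

To track more columns, each one would need its own LAG() comparison in the CASE expression (combined with AND) and would also have to be added to the SELECT list and the GROUP BY. Note that a NULL Company never compares equal to the previous value, so consecutive NULL rows would each start a new group unless the comparison is made NULL-aware.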