sqlsql-servermathtimetime-and-attendance

A mathematical way to divide overlapping time ranges into hours


I am trying to solve a problem of "labor tickets" - where each ticket can have a clock in and clock out time.

But, if multiple tickets overlap, the time should be distributed proportionally.

Here's an example:

Single Ticket (no overlap)

StartTime EndTime Hours
6:00 AM 7:00 AM 1.0

Two tickets (Full overlap)

StartTime EndTime Hours
6:00 AM 7:00 AM 0.5
6:00 AM 7:00 AM 0.5

Three Tickets (Full overlap)

StartTime EndTime Hours
6:00 AM 7:00 AM 0.33
6:00 AM 7:00 AM 0.33
6:00 AM 7:00 AM 0.33

Two Tickets (Partial Overlap)

StartTime EndTime Hours
6:00 AM 7:00 AM 0.875
6:30 AM 6:45 AM 0.125

The overlapping time is divided amongst each ticket

Three Tickets (Complex Overlap)

StartTime EndTime Hours
6:00 AM 7:00 AM 0.33
5:30 AM 7:30 AM 1.08
5:00 AM 7:00 AM 1.08

Or more visually:

enter image description here

Here's a "manual" calculation of the three rows:

  1. 6:00 AM-7:00 AM: There are 3 jobs, so that's 1 hour / 3 = 0.33
  2. 5:30 AM-6:00 AM: There are 2 jobs, so that's 0.5 hours / 2 = 0.25
  3. 6:00 AM-7:00 AM: There are 3 jobs so that's 1/3 = 0.33
  4. 7:00 AM-7:30 AM: Only 1 job so that's 0.5/1 = 0.5
  5. 5:00 AM-7:00 AM:
    • 5:00 AM-5:30 AM: Only 1 job, so 0.5 hours / 1 = 0.5
    • 5:30 AM-6:00 AM: 2 jobs, so 0.5 hours / 2 = 0.25
    • 6:00 AM-7:00 AM: 3 jobs, so 1 hour / 3 = 0.33

To make this simpler, I'm really only trying to make a stored procedure to calculate a single row at a time. So I don't have to worry about the overall recursion of this process.

I'm hoping there is some sort of mathematical formula that would work - I tried taking (current ticket time) divided by (total overlapping time) as a fraction and using that, but that didn't give me a good result on the partial overlaps.

Worst case, I might be able to do this with loops - chunk up our times into "blocks", calculate the time divisor in each blocks, and then add it up again. But, if I can find some sort of mathematical solution, that would be a lot cleaner.

I'm running in SQL Server 2019.

CRME

Here is some sample T-SQL to generate the complex example (SQL Fiddle):

DECLARE @LaborTickets TABLE (
    TicketID INT IDENTITY(1,1),
    StartTime DATETIME,
    EndTime DATETIME
);

INSERT INTO @LaborTickets (StartTime, EndTime)
VALUES
    ('2025-02-04 06:00:00', '2025-02-04 07:00:00'),
    ('2025-02-04 05:30:00', '2025-02-04 07:30:00'),
    ('2025-02-04 05:00:00', '2025-02-04 07:00:00');

and the goal is to return results of the form:

StartTime           EndTime             Hours
------------------  ------------------  -------------------
2025-02-04 6:00 AM  2025-02-04 7:00 AM  0.33333333333333333
2025-02-04 5:30 AM  2025-02-04 7:30 AM  1.08333333333333333
2025-02-04 5:00 AM  2025-02-04 7:00 AM  1.08333333333333333

3 row(s) affected

Edit: In these examples, I am using a maximum of 3 tickets, with nice round number increments.

The real data is a lot more detailed, down to seconds, and there could be a large number of overlapping ones - perhaps up to 20.

Here's another, more complex example:

DECLARE @LaborTickets TABLE (
    TicketID INT IDENTITY(1,1),
    StartTime DATETIME,
    EndTime DATETIME
);

INSERT INTO @LaborTickets (StartTime, EndTime)
VALUES
    ('2025-02-04 06:01:13', '2025-02-04 07:00:00'),
    ('2025-02-04 06:01:21', '2025-02-04 07:01:00'),
    ('2025-02-04 06:01:25', '2025-02-04 07:01:30'),
    ('2025-02-04 06:01:29', '2025-02-04 07:01:50'),
    ('2025-02-04 06:01:35', '2025-02-04 07:01:00'),
    ('2025-02-04 06:01:50', '2025-02-04 07:02:00');

Solution

  • Your "more visually:" section just gave me the solution:
    Just tell SQL to do what you did on a spreadsheet!

    That is, determine "columns" so that each ticket spans one or more complete columns;
    in other words, use the tickets' start and end as virtual scissors that cut others tickets daring to overlap them.

    Simple counts of the (still virtual) cells generated by those cuts (= columns) and tickets (= lines)
    will then tell how many of those unit spans (=cells) are needed by each ticket, or alternatively how many tickets a unit span has running in parallel.
    And this last count is the factor you want to divide the span's duration by, before distributing it to the tickets.

    with
      t as (select * from @LaborTickets),
      -- List all "cuts", where a ticket starts or another stops:
      -- they will be the only places where our "count of parallel tickets" (CTP) can change.
      -- union (without all) ensures we have exactly one occurrence of each timestamp. 
      cuts as (select StartTime t from t union select EndTime from t),
      -- Spans: between two cuts are the monotonic spans (the ranges where the CTP is stable).
      ordCuts as
      (
        select
          row_number() over (order by t) id,
          t StartTime,
          lead(t) over (order by t) EndTime,
          datediff(s, t, lead(t) over (order by t)) / 3600.0 duration
        from cuts
      ),
      spans as (select * from ordCuts where EndTime is not null),
       -- Tickets <-> spans link table: which individual spans is each ticket the union of?
      ts as
      (
        select t.TicketID tid, s.id sid
        from t, spans s
        where t.StartTime <= s.StartTime and t.EndTime >= s.EndTime
      ),
      -- Weighted spans: now we can tell how many tickets "use" each span…
      nts as (select sid, count(1) nTickets from ts group by sid),
      -- … and tell how much relative time each ticket will see.
      wspans as (select s.*, duration / nTickets wdur from spans s, nts where nts.sid = s.id),
      -- Ticket duration: just sum the spans of the ticket to get its weighted duration.
      tdur as (select tid, sum(wdur) hours from ts, wspans s where s.id = ts.sid group by tid)
    select StartTime, EndTime, Hours from tdur join t on tdur.tid = t.TicketID order by t.TicketID;
    

    And of course there's the DBFiddle to play with.