sqlsql-serverdatabase-cursor

How to get rid of CURSOR if every FETCH needs a SELECT based on previous result


We have a couple of tables, let's call them Sections and SectionTimeWindows. Sections table contains data about sections of 'movement', which starts at a place, ends at a place and has a type. The type is important, because it determines how much time will a section take - every section type has a set of time windows and the window determines the running time of a section based on when we arrive at the section.

We need a new table, where StopTimes are recorded, provided we know a DepartureTime. We start with a stop at the beginning of the first section at StopTime=DepartureTime, then determine RunningTime for every section based on previous StopTime, and we insert the StopTime as new row. This means, that the RunningTime and therefore the StopTime values are based on results computed for previous row(s).

Our current implementation uses a CURSOR to iterate over Sections, keep current StopTime, probe the relevant window for next RunningTime, add RunningTime to StopTime and INSERT.

A simplified version of our code is this:

CREATE TABLE #Sections
(
    SectionID INT PRIMARY KEY IDENTITY,
    SectionTypeID INT,
    StartPlaceID INT,
    EndPlaceID INT,
    [Order] INT,
);

CREATE TABLE #SectionTimeWindows
(
    SectionTimeWindowID INT PRIMARY KEY IDENTITY,
    SectionTypeID INT,
    WindowStart INT,
    WindowEnd INT,
    RunningTime INT,
)

CREATE TABLE #StopTimes
(
    StopTimeID INT PRIMARY KEY IDENTITY,
    PlaceID INT,
    IncommingSectionID INT NULL,
    ArrivalTime INT,
    [Order] INT,
)

INSERT INTO #Sections (SectionTypeID, StartPlaceID, EndPlaceID, [Order]) VALUES 
    (1, 10, 20, 1),
    (1, 20, 30, 2),
    (1, 30, 40, 3),
    (1, 40, 50, 4),
    (1, 50, 60, 5)

INSERT INTO #SectionTimeWindows (SectionTypeID, WindowStart, WindowEnd, RunningTime) VALUES
    (1, 00*60, 09*60, 30), -- from midnight to 9AM 
    (1, 09*60, 18*60, 50), -- from 9AM to 18AM    
    (1, 18*60, 24*60, 5) --- dtto

DECLARE @Departure INT = 8 * 60 -- the departure of the whole movement. Used at the first iteration only. This is a parameter of this (when this is converted to SP)
DECLARE @CurrentTime INT = NULL -- the stop time variable hold the current result of current interation
DECLARE @RunningTime INT = NULL -- the current iteration running time, based on window

DECLARE @CurrentSectionID INT;
DECLARE @CurrentSectionTypeID INT;
DECLARE @CurrentStartPlaceID INT;
DECLARE @CurrentEndPlaceID INT;

DECLARE @Order INT = 1;

DECLARE rowIterator CURSOR
    LOCAL FAST_FORWARD FOR
        SELECT SectionID, SectionTypeID, StartPlaceID, EndPlaceID
        FROM #Sections
        ORDER BY [Order]
OPEN rowIterator;

WHILE (1=1)
BEGIN
    FETCH NEXT FROM rowIterator INTO @CurrentSectionID, @CurrentSectionTypeID, @CurrentStartPlaceID, @CurrentEndPlaceID
    IF (@@FETCH_STATUS <> 0) BREAK;
        
    IF (@CurrentTime is null) -- First iteration, we have no previous results, need @Departure as first StopTime
    BEGIN 
        SET @CurrentTime = @Departure
        INSERT INTO #StopTimes (PlaceID, IncommingSectionID, ArrivalTime, [Order]) 
        VALUES (@CurrentStartPlaceID, NULL, @CurrentTime, @Order)
    END

    SET @Order = @Order + 1
    -- Fetch @RunningTime from signle window where @CurrentStopTime is in the window and SectionType matches
    SELECT TOP 1 @RunningTime = RunningTime FROM #SectionTimeWindows 
        WHERE SectionTypeID = @CurrentSectionTypeID AND WindowStart <= @CurrentTime AND @CurrentTime < WindowEnd
    -- Add @RunningTime to @CurrentStopTime. This value will be used in next iteration to get new window
    SET @CurrentTime = @CurrentTime + @RunningTime
    INSERT INTO #StopTimes (PlaceID, IncommingSectionID, ArrivalTime, [Order])
        VALUES (@CurrentEndPlaceID, @CurrentSectionID, @CurrentTime, @Order)
END
CLOSE rowIterator
DEALLOCATE rowIterator

SELECT StopTimeID, PlaceID, IncommingSectionID, 
ArrivalTime / 60 as Hours, ArrivalTime % 60 as Minutes,
ArrivalTime - LAG(ArrivalTime) Over (ORDER BY [Order]) as RunningTime
FROM #StopTimes


DROP TABLE #Sections
DROP TABLE #StopTimes
DROP TABLE #SectionTimeWindows

The real structure is a bit more complicated, the data is also grouped by MovementID and apart from the ArrivalTime, we also track several other properties. We may also wait at a StopPlace, which delays the departure to the next section as well. But this is the gist of it.

The real procedure takes up to 6 hours to compute (real Sections table has around 10^6 rows and SectionTimeWindows table has around 10^7 rows). The issue is most likely the SELECT in every FETCH iteration, but I don't know how to JOIN the SectionTimeWindows if I don't have the Probe to do it.

I've tried to approach the problem using LAG, but got nowhere, as the previous row's ArrivalTime is a result of it's previous row's ArrivalTime recursively. I have the same issue with SUM OVER(PARTITION..RANGE..), but again, I have no idea how to JOIN the window. I would need some kind of SUM(SELECT...WHERE PROBE)...

Is there any way to remove this kind of CURSOR? And if not, is there any way to speed up the SELECT TOP 1?

Here is a hopefully more clear explanation of what the function should do:

  1. We have a set of Sections, every section has a set of SectionTimeWindows where every window defines an interval of time for which a RunningTime in a Section is valid. We stop between sections. If we arrive at the section between 00:00 and 9:00, the RunningTime in section is 30 minutes; if we arrive between 9:00 and 18:00, the RunningTime is 50 minutes etc
  2. The arrival to a Section is recorded in StopTimes table, ArrivalTime column. There are N+1 StopTimes as we stop before the first section as well.
  3. The arrival to a section is determined by ArrivalTime to previous section + RunningTime given by the SectionTimeWindows where ArrivalTime in previous section falls into a window <WindowStart, WindowEnd)

Solution

  • The timetable does not depend on the places of stops. Therefore, first we make a separate schedule (timetable) for every SectionTypeId.

    From #SectionTimeWindows:

    SectionTimeWindowID SectionTypeID WindowStart WindowEnd RunningTime
    1 1 0 540 30
    2 1 540 1080 50
    3 1 1080 1440 5

    To timetable:
    Converted to timetable. Part of timetable for SectionTypeId=1. We begin from 8:00.

    lvl Num SectionTypeID WindowStart WindowEnd RunningTime hours minutes
    0 null 1 null 480 null 8 0
    1 0 1 480 510 30 8 30
    2 1 1 510 540 30 9 0
    3 2 1 540 590 50 9 50
    4 3 1 590 640 50 10 40
    5 4 1 640 690 50 11 30
    6 5 1 690 740 50 12 20
    7 6 1 740 790 50 13 10
    8 7 1 790 840 50 14 0
    9 8 1 840 890 50 14 50
    10 9 1 890 940 50 15 40

    Then, join places (ordered by [Order]) to timetable.
    Some fiddling with the starting point slightly complicates (confuses) the query.

    See example

    Test data:

    SectionID SectionTypeID StartPlaceID EndPlaceID Order
    1 1 10 20 1
    2 1 20 30 2
    3 1 30 40 3
    4 1 40 50 4
    5 1 50 60 5
    6 2 11 21 1
    7 2 21 31 2
    8 2 31 41 3
    9 2 41 51 4
    10 2 51 61 5
    SectionTimeWindowID SectionTypeID WindowStart WindowEnd RunningTime
    1 1 0 540 30
    2 1 540 1080 50
    3 1 1080 1440 5
    4 2 0 600 40
    5 2 600 1080 60
    6 2 1080 1440 20
    with r as( -- recursively create timetable
    select 0 lvl, null Num,SectionTypeID
      ,null WindowStart
      ,08*60  as WindowEnd  
      ,null RunningTime
    from #SectionTimeWindows w
     where w.WindowStart=(select min(WindowStart) from #SectionTimeWindows w2
                        where w2.SectionTypeId=w.SectionTypeId)
      union all
    select lvl+1,coalesce(r.Num,-1)+1,w.SectionTypeID
      ,r.WindowEnd WindowStart
      ,r.WindowEnd+w.RunningTime WindowEnd
      ,w.RunningTime
    from r inner join #SectionTimeWindows w 
      on r.SectionTypeId=w.SectionTypeId
        and w.WindowStart<=r.WindowEnd and w.WindowEnd>r.WindowEnd 
    -- where lvl<17  -- for debug only
    )
    select 
       s.SectionTypeID, WindowStart, WindowEnd, RunningTime
      ,case when num is null then StartPlaceID else EndPlaceId end PlaceId
      ,case when num is null then null else SectionId end IncomingSectionId
      ,WindowEnd/60 hours
      ,WindowEnd%60 minutes
    from r 
    inner join (select *, row_number()over(partition by SectionTypeId order by [Order] )rn
          from #Sections) s 
      on r.SectionTypeId=s.SectionTypeId 
        and (r.Num=(s.rn-1) or (r.num is null and rn=1))
    order by s.SectionTypeId,lvl;
    
    

    Output

    SectionTypeID WindowStart WindowEnd RunningTime PlaceId Inc.SectionId hours minutes
    1 null 480 null 10 null 8 0
    1 480 510 30 20 1 8 30
    1 510 540 30 30 2 9 0
    1 540 590 50 40 3 9 50
    1 590 640 50 50 4 10 40
    1 640 690 50 60 5 11 30
    2 null 480 null 11 null 8 0
    2 480 520 40 21 6 8 40
    2 520 560 40 31 7 9 20
    2 560 600 40 41 8 10 0
    2 600 660 60 51 9 11 0
    2 660 720 60 61 10 12 0

    fiddle

    Update1.
    We can reduce recursion depth.
    If window size(WindowEnd-WindowStart)>> RunningTime, we should carefully take rows near WindowStart and WindowEnd by recursion and all other rows generate thru generate_series.

    -- timetable with  reduced recursion
    with r as(
    select 0 lvl, 1 repeatCnt, null Num,SectionTypeID
      ,null WindowStart
      ,08*60  as WindowEnd  
      ,null RunningTime
    from #SectionTimeWindows w
     where w.WindowStart=(select min(WindowStart) from #SectionTimeWindows w2
                        where w2.SectionTypeId=w.SectionTypeId)
      union all
    select lvl+1
      ,case when (w.WindowEnd-r.WindowEnd)>w.runningTime then -- multy row
          floor((w.WindowEnd-r.WindowEnd)/w.runningTime)
       else 1 
       end repeatCnt
      ,coalesce(r.Num,-1)+1,w.SectionTypeID
      ,r.WindowEnd WindowStart
      ,case when (w.WindowEnd-r.WindowEnd)>w.runningTime then -- multy row
          r.WindowEnd+floor((w.WindowEnd-r.WindowEnd)/w.runningTime)*w.runningTime
       else r.WindowEnd+w.RunningTime 
       end WindowEnd
      ,w.RunningTime
    from r inner join #SectionTimeWindows w 
      on r.SectionTypeId=w.SectionTypeId
        and w.WindowStart<=r.WindowEnd and w.WindowEnd>r.WindowEnd 
    where lvl<17
    )
    ,ExpandedTimetable as(
      select lvl,repeatCnt,Num,SectionTypeID
        ,WindowStart+RunningTime*(value-1) WindowStart
        ,WindowStart+RunningTime*(value) WindowEnd
        ,WindowStart grStart, WindowEnd grEnd
        ,RunningTime
        ,value n
      from r
      cross apply generate_series(1,r.repeatCnt)tn
      )
     select * from r order by SectionTypeId,lvl;
     -- select * from ExpandedTimeTable order by SectionTypeId,lvl;
    
      select * 
      ,WindowEnd/60 hours
      ,WindowEnd%60 minutes
      from ExpandedTimeTable order by SectionTypeId,lvl;
    
    

    Recursion query output is only several rows. Recursin depth is 4.

    lvl repeatCnt Num SectionTypeID WindowStart WindowEnd RunningTime
    0 1 null 1 null 480 null
    1 2 0 1 480 540 30
    2 10 1 1 540 1040 50
    3 1 2 1 1040 1090 50
    4 70 3 1 1090 1440 5
    0 1 null 2 null 480 null
    1 3 0 2 480 600 40
    2 8 1 2 600 1080 60
    3 18 2 2 1080 1440 20

    Then, we can expand timetable. Every row to repeatNum rows.

    lvl repeatCnt Num SectionTypeID WindowStart WindowEnd grStart grEnd RunningTime n
    0 1 null 1 null null null 480 null 1
    1 2 0 1 480 510 480 540 30 1
    1 2 0 1 510 540 480 540 30 2
    2 10 1 1 540 590 540 1040 50 1
    2 10 1 1 590 640 540 1040 50 2
    2 10 1 1 640 690 540 1040 50 3
    2 10 1 1 690 740 540 1040 50 4
    2 10 1 1 740 790 540 1040 50 5
    2 10 1 1 790 840 540 1040 50 6
    2 10 1 1 840 890 540 1040 50 7
    2 10 1 1 890 940 540 1040 50 8
    2 10 1 1 940 990 540 1040 50 9
    2 10 1 1 990 1040 540 1040 50 10
    3 1 2 1 1040 1090 1040 1090 50 1
    4 70 3 1 1090 1095 1090 1440 5 1
    4 70 3 1 1095 1100 1090 1440 5 2
    4 70 3 1 1100 1105 1090 1440 5 3
    4 70 3 1 1105 1110 1090 1440 5 4

    fiddle