sqlsql-serverdatabase-cursor

Optimized Alternative of CURSOR


I have the following SQL query that uses CURSOR to iterate over each row of the INNER JOIN result. This is not performing well for large sets of data. I need to optimize the query. I tried to find a solution without using the CURSOR. But got no solution.

What is my target?

The idea is to insert at least 2 data for each row of the inner join result. And may insert an extra 3rd row based on a condition.

So, let's say, we have an InnerJoin result table like below

InnerJoin Table

PID E.CS E.CE W.CS W.CE
1 1 1 1 1
1 1 2 12 14
2 3 18 9 9

And, for each row of the InnerJoin table, we want to insert rows into another table based on the following rule. Here, 2 rows are surely inserted according to Rule#1 and Rule#2. For Rule#3, a third row may or may not be inserted into the RESULT table.

RULES

So, here is what the result table would look like,

Result Table @RETTB

PID CS CE
1 1 1
1 1 1
1 1 2
1 12 14
1 12 14
2 3 18
3 9 18

I have done this with CURSOR by iterating over the InnerJoin table and for each iteration, I insert rows to the result table based on the RULES.

Here is the SQL code,

DECLARE 
    @W_CS NUMERIC,
    @W_CE NUMERIC,

    @E_CS NUMERIC,
    @E_CE NUMERIC,

    @CS_1 NUMERIC,
    @CE_1 NUMERIC,

    @CS_2 NUMERIC,
    @CE_2 NUMERIC,

    @maxLoopCount NUMERIC = 5000
-- The CURSOR loop is currently running for a maximum of 5000 times. 
-- This means, that whatever the volume of the InnerJoin table is, the CURSOR will iterate at max 5k times.

DECLARE Cursor_1 CURSOR FOR
SELECT  E.PID,
        E.EDATE,
        E.DAY,
        E.OpID,
        E.CS, 
        E.CE,
        W.CS, 
        W.CE,   
FROM    @Cat E
INNER JOIN @Weekly W
        ON  E.PID = W.PID
        AND E.EDATE = W.EDate
        AND E.DAY = W.DAY   
        AND E.OpID = W.OpID

OPEN    Cursor_1;

---- Perform the first fetch.  
FETCH NEXT FROM Cursor_1    INTO 
        @E_CS,
        @E_CE,
        @W_CS,
        @W_CE;  

DECLARE @insertRowCount INT = 0


WHILE (@@FETCH_STATUS = 0  AND @maxLoopCount > 0)
BEGIN           
    -- control the loop to prevent deathlock
    Set @maxLoopCount = @maxLoopCount - 1


    SET         @insertRowCount = 0

    INSERT INTO @RETTB  
    SELECT      @E_CS,  @E_CE



    IF (@W_CS < @E_CE)                                      
    BEGIN
        SELECT  @insertRowCount = 2
                ,@CS_1  = @mW_StartCount
                ,@CE_1  = @mE_StartCount
                ,@CS_2  = @mE_EndCount
                ,@CE_2  = @mW_EndCount
    END
    ELSE
    BEGIN
        SELECT  @insertRowCount = 1
                ,@CS_1    = GREATEST(E_CS, W_CS) 
                ,@CE_1  =  GREATEST(E_CE, W_CE)
    END 

    IF @insertRowCount >= 1
    BEGIN
        INSERT INTO @RETTB
        SELECT @CS_1, @CE_1
    END

    IF @insertRowCount = 2
    BEGIN
        INSERT INTO @RETTB
        SELECT      @CS_2, @CE_2
    END

    ---- Perform the next fetch.  
    FETCH NEXT FROM Cursor_1    INTO 
        @E_CS,
        @E_CE,
        @W_CS,
        @W_CE;  

END

CLOSE   Cursor_1;
DEALLOCATE Cursor_1;

What problem am I facing?

This approach is very slow. I need to do it with a set-based approach. But cannot figure it out.


Solution

  • What you are looking for can likely be formed using a CROSS APPLY that generates the values you need to insert, derived from the preceding FROM clause content.

    Typically, the CROSS APPLY would contain a VALUE tuples collection or a sequence of SELECTs merged with a UNION ALL. The SELECT/UNION approach allows you to also apply WHERE conditions.

    Something like:

    INSERT Target
    SELECT C.*
    FROM E
    JOIN W ON ...
    CROSS APPLY (
        SELECT E.CS, E.CE
        UNION ALL
        SELECT GREATEST(E.CS, W.CS) AS CS, GREATEST(E.CE, W.CE) AS CE
        UNION ALL
        SELECT W.CS, W.CE WHERE W.CS > E.CE
    ) C
    

    You may need to tweak the CROSS APPLY to match your original logic, but the important point of this answer is the technique.

    The GREATEST() function is available in SQL Server 2022 and later. For earlier versions you can use a CASE expression of IIF() function like CASE WHEN E.CS > W.CS THEN E.CS ELSE W.CS END OR IIF(E.CS > W.CS, E.CS, W.CS).