I have the following SQL query that uses CURSOR to iterate over each row of the INNER JOIN result. This is not performing well for large sets of data. I need to optimize the query. I tried to find a solution without using the CURSOR. But got no solution.
The idea is to insert at least 2 data for each row of the inner join result. And may insert an extra 3rd row based on a condition.
So, let's say, we have an InnerJoin result table like below
PID | E.CS | E.CE | W.CS | W.CE |
---|---|---|---|---|
1 | 1 | 1 | 1 | 1 |
1 | 1 | 2 | 12 | 14 |
2 | 3 | 18 | 9 | 9 |
And, for each row of the InnerJoin table, we want to insert rows into another table based on the following rule. Here, 2 rows are surely inserted according to Rule#1 and Rule#2. For Rule#3, a third row may or may not be inserted into the RESULT table.
So, here is what the result table would look like,
PID | CS | CE |
---|---|---|
1 | 1 | 1 |
1 | 1 | 1 |
1 | 1 | 2 |
1 | 12 | 14 |
1 | 12 | 14 |
2 | 3 | 18 |
3 | 9 | 18 |
I have done this with CURSOR by iterating over the InnerJoin table and for each iteration, I insert rows to the result table based on the RULES.
Here is the SQL code,
DECLARE
@W_CS NUMERIC,
@W_CE NUMERIC,
@E_CS NUMERIC,
@E_CE NUMERIC,
@CS_1 NUMERIC,
@CE_1 NUMERIC,
@CS_2 NUMERIC,
@CE_2 NUMERIC,
@maxLoopCount NUMERIC = 5000
-- The CURSOR loop is currently running for a maximum of 5000 times.
-- This means, that whatever the volume of the InnerJoin table is, the CURSOR will iterate at max 5k times.
DECLARE Cursor_1 CURSOR FOR
SELECT E.PID,
E.EDATE,
E.DAY,
E.OpID,
E.CS,
E.CE,
W.CS,
W.CE,
FROM @Cat E
INNER JOIN @Weekly W
ON E.PID = W.PID
AND E.EDATE = W.EDate
AND E.DAY = W.DAY
AND E.OpID = W.OpID
OPEN Cursor_1;
---- Perform the first fetch.
FETCH NEXT FROM Cursor_1 INTO
@E_CS,
@E_CE,
@W_CS,
@W_CE;
DECLARE @insertRowCount INT = 0
WHILE (@@FETCH_STATUS = 0 AND @maxLoopCount > 0)
BEGIN
-- control the loop to prevent deathlock
Set @maxLoopCount = @maxLoopCount - 1
SET @insertRowCount = 0
INSERT INTO @RETTB
SELECT @E_CS, @E_CE
IF (@W_CS < @E_CE)
BEGIN
SELECT @insertRowCount = 2
,@CS_1 = @mW_StartCount
,@CE_1 = @mE_StartCount
,@CS_2 = @mE_EndCount
,@CE_2 = @mW_EndCount
END
ELSE
BEGIN
SELECT @insertRowCount = 1
,@CS_1 = GREATEST(E_CS, W_CS)
,@CE_1 = GREATEST(E_CE, W_CE)
END
IF @insertRowCount >= 1
BEGIN
INSERT INTO @RETTB
SELECT @CS_1, @CE_1
END
IF @insertRowCount = 2
BEGIN
INSERT INTO @RETTB
SELECT @CS_2, @CE_2
END
---- Perform the next fetch.
FETCH NEXT FROM Cursor_1 INTO
@E_CS,
@E_CE,
@W_CS,
@W_CE;
END
CLOSE Cursor_1;
DEALLOCATE Cursor_1;
This approach is very slow. I need to do it with a set-based approach. But cannot figure it out.
What you are looking for can likely be formed using a CROSS APPLY
that generates the values you need to insert, derived from the preceding FROM
clause content.
Typically, the CROSS APPLY
would contain a VALUE
tuples collection or a sequence of SELECT
s merged with a UNION ALL
. The SELECT/UNION approach allows you to also apply WHERE
conditions.
Something like:
INSERT Target
SELECT C.*
FROM E
JOIN W ON ...
CROSS APPLY (
SELECT E.CS, E.CE
UNION ALL
SELECT GREATEST(E.CS, W.CS) AS CS, GREATEST(E.CE, W.CE) AS CE
UNION ALL
SELECT W.CS, W.CE WHERE W.CS > E.CE
) C
You may need to tweak the CROSS APPLY
to match your original logic, but the important point of this answer is the technique.
The GREATEST()
function is available in SQL Server 2022 and later. For earlier versions you can use a CASE
expression of IIF()
function like CASE WHEN E.CS > W.CS THEN E.CS ELSE W.CS END
OR IIF(E.CS > W.CS, E.CS, W.CS)
.