I need help with optimizing a cursor or changing the code completely. I have the below requirement:
Create column Sequence grouped by ColumnA, ColumnD and GroupA. StartA is used for sorting. Have tried using LAG, Row_Number, etc with no joy since the grouping sequence restart on change of Column D taking into account ColumnA (that can repeat) and GroupA sorted by StartA.
The code below works fine for a small set of records but last time I run it took over 3 hours and did not complete so I have killed the job. The table has over 700,000 records. Looking for any tips on how to improve this. Thank you! Sample result using DENSE_RANK:
DECLARE
@ColumnA VARCHAR(10),
@StartA DATETIME,
@ColumnD VARCHAR(50),
@Sequence INTEGER,
@Sequence_Calc INTEGER = 1,
@Previous_ColumnA VARCHAR(10),
@Previous_ColumnD VARCHAR(50)
SELECT *
INTO #Temp_Table
FROM TABLEA
ORDER BY ColumnA,
ColumnD
DECLARE Seq_Cursor CURSOR
FOR SELECT ColumnA,
StartA,
ColumnD,
Sequence
FROM #Temp_Table
ORDER BY ColumnA,
ColumnD
FOR UPDATE OF Sequence
OPEN Seq_Cursor
FETCH NEXT FROM Seq_Cursor
INTO @ColumnA, @StartA, @ColumnD, @Sequence
WHILE @@FETCH_STATUS= 0
BEGIN
BEGIN
UPDATE #Temp_Table
SET Sequence = @Sequence_Calc
WHERE ColumnD = @ColumnD
AND StartA = @StartA
AND ColumnA = @ColumnA
SET @Previous_ColumnA = @ColumnA
SET @Previous_ColumnD = @ColumnD
END
FETCH NEXT FROM Seq_Cursor
INTO @ColumnA, @StartA, @ColumnD, @Sequence
BEGIN
SELECT @Sequence_Calc = CASE WHEN @Previous_ColumnD = @ColumnD THEN
CASE WHEN @Previous_ColumnA <> @ColumnA THEN @Sequence_Calc + 1 ELSE @Sequence_Calc END
ELSE 1 END
END
END
CLOSE Seq_Cursor
DEALLOCATE Seq_Cursor
Not sure why you're messing around with cursors, they are slow and inefficient, complex to write and complex to understand.
It's really hard to tell without a fuller explanation of the desired logic, but it seems it's a Gaps-and-Islands problem.
You need to use
LAG
to mark the rows that are the start of a new groupCOUNT
to create a group IDROW_NUMBER
partitioned by that ID.WITH StartValues AS (
SELECT *,
CASE WHEN
ColumnA = LAG(ColumnA) OVER (PARTITION BY ColumnD ORDER BY StartA)
AND GroupA = LAG(GroupA) OVER (PARTITION BY ColumnD ORDER BY StartA)
THEN NULL ELSE 1 END AS IsStart
FROM TABLEA a
),
Grouped AS (
SELECT *,
COUNT(IsStart) OVER (PARTITION BY ColumnD ORDER BY StartA) AS GroupID
FROM Grouped
)
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ColumnD, GroupID ORDER BY StartA) AS Sequence
FROM Grouped;