sqlteradata

Rewriting query


I have a scenario where I want to take only one row from a group based on the column values in TID column. For example my key columns in the table SRN,GIO and FID. Now there is another column TID, now within a group when two TIDS given in case statement are present in the group, then I need to take one with rank 1(calculated), otherwise we can take the other record. Also note that there are records which are not part of this conditions which should also come as output. Given sample records, and also expected output below

Input data

SRN GIO FID TID
ASD123 1 100 218
ASD123 1 100 303
ASD123 1 100 454
ASD123 1 100 678
DEF234 0 6566 2222
DEF234 0 6566 307
DEF234 0 6566 2IUD
WER453 1 8123 308
WER453 1 8123 2wER
WER453 1 8123 56TY

I have written the query as below, but since there are multiple case conditions, its going for full table scan for each condition, which is causing performance as there are lot of records in the table. so is there a better way to rewrite this query.

    select case when TID in('218','303')
then RANK() OVER (PARTITION BY SRN, GIO,FID ORDER BY CASE when TID ='218' then 1 else 2 END) end COL_3 ,
case when TID in('2222','307')
then RANK() OVER (PARTITION BY SRN, GIO,FID ORDER BY CASE when TID ='2222' then 1 else 2 END) end COL_4 ,
case when TID in('2202','308')
then RANK() OVER (PARTITION BY SRN, GIO,FID ORDER BY CASE when TID ='2202' then 1 else 2 END) end COL_5 ,
case when TID in('2220','305')
then RANK() OVER (PARTITION BY SRN, GIO,FID ORDER BY CASE when TID ='2220' then 1 else 2 END) end COL_6 ,
case when TID in('2211','311')
then RANK() OVER (PARTITION BY SRN, GIO,FID ORDER BY CASE when TID ='2211' then 1 else 2 END) end COL_7
from table where
((COL_3 = 1 or COL_4 = 1 OR COL_5=1 OR COL_6=1 OR COL_7=1)
OR
(COL_3 is null and COL_4 is null and COL_5 is null and COL_6 is null and COL_7 is null)
)

Expected output

SRN GIO FID TID
ASD123 1 100 218
ASD123 1 100 454
ASD123 1 100 678
DEF234 0 6566 2222
DEF234 0 6566 2IUD
WER453 1 8123 308
WER453 1 8123 2wER
WER453 1 8123 56TY

Solution

  • To rewrite the query for better performance and avoid multiple full table scans, you can use a single ranking query with a CASE statement that dynamically handles all your conditions. This eliminates the need for multiple RANK operations for each condition, consolidating the logic into a single query.

    You can try this:

    WITH RankedData AS (
        SELECT
            SRN,
            GIO,
            FID,
            TID,
            RANK() OVER (
                PARTITION BY SRN, GIO, FID
                ORDER BY 
                    CASE
                        WHEN TID IN ('218', '303') THEN 
                            CASE WHEN TID = '218' THEN 1 ELSE 2 END
                        WHEN TID IN ('2222', '307') THEN 
                            CASE WHEN TID = '2222' THEN 1 ELSE 2 END
                        WHEN TID IN ('2202', '308') THEN 
                            CASE WHEN TID = '2202' THEN 1 ELSE 2 END
                        WHEN TID IN ('2220', '305') THEN 
                            CASE WHEN TID = '2220' THEN 1 ELSE 2 END
                        WHEN TID IN ('2211', '311') THEN 
                            CASE WHEN TID = '2211' THEN 1 ELSE 2 END
                        ELSE NULL
                    END
            ) AS rank_in_group
        FROM your_table
    ),
    FilteredData AS (
        SELECT *
        FROM RankedData
        WHERE rank_in_group = 1
        UNION ALL
        SELECT *
        FROM RankedData
        WHERE rank_in_group IS NULL
    )
    SELECT SRN, GIO, FID, TID
    FROM FilteredData;