I have a scenario where I want to take only one row from a group based on the column values in TID column. For example my key columns in the table SRN,GIO and FID. Now there is another column TID, now within a group when two TIDS given in case statement are present in the group, then I need to take one with rank 1(calculated), otherwise we can take the other record. Also note that there are records which are not part of this conditions which should also come as output. Given sample records, and also expected output below
Input data
SRN | GIO | FID | TID |
---|---|---|---|
ASD123 | 1 | 100 | 218 |
ASD123 | 1 | 100 | 303 |
ASD123 | 1 | 100 | 454 |
ASD123 | 1 | 100 | 678 |
DEF234 | 0 | 6566 | 2222 |
DEF234 | 0 | 6566 | 307 |
DEF234 | 0 | 6566 | 2IUD |
WER453 | 1 | 8123 | 308 |
WER453 | 1 | 8123 | 2wER |
WER453 | 1 | 8123 | 56TY |
I have written the query as below, but since there are multiple case conditions, its going for full table scan for each condition, which is causing performance as there are lot of records in the table. so is there a better way to rewrite this query.
select case when TID in('218','303')
then RANK() OVER (PARTITION BY SRN, GIO,FID ORDER BY CASE when TID ='218' then 1 else 2 END) end COL_3 ,
case when TID in('2222','307')
then RANK() OVER (PARTITION BY SRN, GIO,FID ORDER BY CASE when TID ='2222' then 1 else 2 END) end COL_4 ,
case when TID in('2202','308')
then RANK() OVER (PARTITION BY SRN, GIO,FID ORDER BY CASE when TID ='2202' then 1 else 2 END) end COL_5 ,
case when TID in('2220','305')
then RANK() OVER (PARTITION BY SRN, GIO,FID ORDER BY CASE when TID ='2220' then 1 else 2 END) end COL_6 ,
case when TID in('2211','311')
then RANK() OVER (PARTITION BY SRN, GIO,FID ORDER BY CASE when TID ='2211' then 1 else 2 END) end COL_7
from table where
((COL_3 = 1 or COL_4 = 1 OR COL_5=1 OR COL_6=1 OR COL_7=1)
OR
(COL_3 is null and COL_4 is null and COL_5 is null and COL_6 is null and COL_7 is null)
)
Expected output
SRN | GIO | FID | TID |
---|---|---|---|
ASD123 | 1 | 100 | 218 |
ASD123 | 1 | 100 | 454 |
ASD123 | 1 | 100 | 678 |
DEF234 | 0 | 6566 | 2222 |
DEF234 | 0 | 6566 | 2IUD |
WER453 | 1 | 8123 | 308 |
WER453 | 1 | 8123 | 2wER |
WER453 | 1 | 8123 | 56TY |
To rewrite the query for better performance and avoid multiple full table scans, you can use a single ranking query with a CASE
statement that dynamically handles all your conditions. This eliminates the need for multiple RANK
operations for each condition, consolidating the logic into a single query.
You can try this:
WITH RankedData AS (
SELECT
SRN,
GIO,
FID,
TID,
RANK() OVER (
PARTITION BY SRN, GIO, FID
ORDER BY
CASE
WHEN TID IN ('218', '303') THEN
CASE WHEN TID = '218' THEN 1 ELSE 2 END
WHEN TID IN ('2222', '307') THEN
CASE WHEN TID = '2222' THEN 1 ELSE 2 END
WHEN TID IN ('2202', '308') THEN
CASE WHEN TID = '2202' THEN 1 ELSE 2 END
WHEN TID IN ('2220', '305') THEN
CASE WHEN TID = '2220' THEN 1 ELSE 2 END
WHEN TID IN ('2211', '311') THEN
CASE WHEN TID = '2211' THEN 1 ELSE 2 END
ELSE NULL
END
) AS rank_in_group
FROM your_table
),
FilteredData AS (
SELECT *
FROM RankedData
WHERE rank_in_group = 1
UNION ALL
SELECT *
FROM RankedData
WHERE rank_in_group IS NULL
)
SELECT SRN, GIO, FID, TID
FROM FilteredData;