[SOLVED] Select min or max based on where duplicate value exists

Select min or max based on where duplicate value exists

I'm trying to find a way to select the minimum or maximum from a range of data based on these conditions:

Group by columns: location_cd, machine_cd, product_cd, temperature
If multiple rows have the same value of MAX(created_timestamp) then select the row with the MIN(data_source)
Otherwise, just select the row with the MAX(created_timestamp).

Sample table: sold_out

location_cd	machine_cd	product_cd	temperature	data_source	created_timestamp
0012345678	1234567890	123456	1	3	2024/06/10 15:30:15
0012345678	1234567890	123456	1	4	2024/06/10 15:30:15
0012345678	1234567890	654321	0	3	2024/06/10 14:50:15
0012345678	1234567890	654321	0	4	2024/06/10 15:00:15

Expected output :

location_cd	machine_cd	product_cd	temperature	data_source	created_timestamp
0012345678	1234567890	123456	1	3	2024/06/10 15:30:15
0012345678	1234567890	654321	0	4	2024/06/10 15:00:15

I tried this, but it is not working.

SELECT machine_cd,
    location_cd,
    product_cd,
    temperature,
    case WHEN exists(
        SELECT machine_cd,
            location_cd,
            product_cd,
            temperature,
            created_timestamp,
            COUNT(*)
        From sold_out
        group by machine_cd,
            location_cd,
            product_cd,
            temperature,
            created_timestamp
        HAVING COUNT(*) > 1
    ) THEN MIN(data_source) ELSE MAX(created_timestamp) END
FROM sold_out
GROUP BY machine_cd,
    location_cd,
    product_cd,
    temperature

Solution

Like all TOP 1 per group problems you can just create a row number partitioned and ordered as per your requirements and use that to filter the rows you need e.g.

with cte as (
    select *
        , row_number() over (partition by location_cd, machine_cd, product_cd, temperature order by created_timestamp desc, data_source asc) rn
    from TestData
)
select location_cd, machine_cd, product_cd, temperature, data_source, created_timestamp
from cte
where rn = 1
order by location_cd, machine_cd, product_cd, temperature, data_source, created_timestamp;

DBFiddle