sqlsql-servermedianrow-number

Use row_number() to take middle row in a set of records


previously I opened an issue asking how to take the first row in a set of records which were resolved brilliantly. Pitifully my customer changed his request a little after so this time I'm asking your help to find the middle row in a set of records i.e. having this:

TemperatureID CastingID TemperatureDateTime TemperatureValue
1421294 1073513 2021-01-07 11:53:00.000 1648
1421295 1073513 2021-01-07 11:54:00.000 1698
1421296 1073513 2021-01-07 11:57:00.000 1699

I have to take the 1421295 record, whereas having an even number either the middle + 1 or middle - 1

So starting from this query, which return the 1421294 record to be clear

SELECT * 
    FROM (
        SELECT 
            CastingID, TemperatureValue, TemperatureDateTime
            , ROW_NUMBER() OVER (PARTITION BY CastingID ORDER BY CastingID) RN
        FROM Production.Temperatures
        where YEAR(TemperatureDateTime) = 2021 and PhaseID = 250
    ) A
    WHERE RN = 1 and year(A.TemperatureDateTime) = 2021

can I obtain the result I want or I have to change my approach?


Solution

  • Edit: Updated to use a cte. Edit2: Updated to use a join rather than a sub-select to handle multiple castingIds

    Since ROW_NUMBER() is going to give us a continuous set of values, why not find MAX(RN), divide by 2 and then round? If the MAX(RN) is odd, you'll get the true median, if it's even you'll get rounded down. There's likely a cleaner way to do this but something like this:

      WITH cte AS (
            SELECT 
                temperatureID 
                ,castingID
                ,temperatureValue
                ,ROW_NUMBER() OVER (PARTITION BY castingID ORDER BY TemperatureDateTime) AS RN
            FROM Temperatures
    )
    SELECT
        * 
    FROM cte AS c
    INNER JOIN (
        SELECT
            castingID
            ,CEILING(CONVERT(DECIMAL(7,2),MAX(RN)) / 2) AS med
        FROM cte
        GROUP BY castingID
    ) AS m ON c.rn = m.med AND c.castingID = m.castingID
    

    Here is a SQL Fiddle with the result of the query: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=5b3aebf3ab4ced946c90a435d9edce3c

    There's three use cases (all with different castingID).

    1.) Odd number of rows

    2.) Even number of rows

    3.) A single row