sqlsql-serverrow-number

row_number giving me trouble when called from a join


When I run the code below I get no results, but if I remove the b.row field from the first select statement, I do get the expected result set. What I want ultimately, is to have a column in the result set that indicates the rank or order of each of rows from the subquery done to the Dim table

Select b.row, a.id, b.year, b.month, b.max_week, a.Sector, a.Item
from FACT a
 join   
    (Select top 12 row_number() over (order by year asc, month asc) as row,
    year, month, max(week) max_week, max(id) max_id
        from DIM  
        group by year, month ) b
on a.id = b.max_id
and Sector <> '3101' 

result set :

enter image description here


Solution

  • Most likely reason is that TOP 12 returned an arbitrary set of 12 rows in the absence of an ORDER BY clause. The ORDER BY you used in the ROW_NUMBER() function is only used for the purposes of evaluating row numbers.

    When I said 'arbitrary', it is not actually so; the rows from the sub-query would be returned to the higher level in an order determined by how the RDBMS executes that sub-query; it may or may not involve sorting. In any case you can't rely on the server's internal sorting; you must specify how you want the ordering is done.

    It may so happen that when you didn't specify b.row, the engine removed the row_number()... from the sub-query because it is not used anywhere, and returned rows from it in a different way. When you specified b.row, it had to sort the rows in the subquery to evaluate row_number() and it returned them in that order (but you can't rely on it).

    Select b.row, a.id, b.year, b.month, b.max_week, a.Sector, a.Item
    from FACT a
     join   
        (Select top 12 row_number() over (order by year asc, month asc) as row,
        year, month, max(week) max_week, max(id) max_id
            from DIM  
            group by year, month 
            order by year, month  --<< Add this
        ) b
    on a.id = b.max_id
    and Sector <> '3101'