sqloraclepaginationsql-order-by

sql pagination when 'order by' non-deterministic


I've been using the same chunk of sql pagination code for years.. and I only now have noticed this oddity.. I think it's pretty interesting, and we should discuss it. So here is the standard pagination boilerplate:

  SELECT a.*
  FROM (SELECT b.*,
               rownum b_rownum
          FROM (SELECT c.*
                  FROM some_table c
                 ORDER BY some_column) b
         WHERE rownum <= <<upper limit>>) a
 WHERE b_rownum >= <<lower limit>>`

which works fantastic if some_column is sequential.

but I'm exposing this and allowing user to sort on any column, and if they happen to choose some_column that's full of the same values - the pagination 'breaks'.

that is to say, the query will return the same exact row data page after page. And as I think about it, why wouldn't it. it probably just picks the fastest rows or whatever that pass the filter.

so for example, these sql return the exact same data

select *
from (select a.*, ROWNUM rnum
      from ( select * from xsd order by PREFIX asc ) a
      where ROWNUM <= 30
     )
where rnum  >= 20;


select *
from (select a.*, ROWNUM rnum
      from ( select * from xsd order by PREFIX asc ) a
      where ROWNUM <= 40
     )
where rnum  >= 30;

mostly I just thought that was neat & I didn't see much about that side-effect in other posts.

And also, I wondered what I could do about it.. and if other strategy would add processing time.

SOLUTION from Gordon, Alex

just add the rowid as a default final order by. these SQL are now different. I didn't notice any change in response time

select *
from (select a.*, ROWNUM rnum
      from ( select * from xsd order by PREFIX asc, rowid ) a
      where ROWNUM <= 30
     )
where rnum  >= 20;


select *
from (select a.*, ROWNUM rnum
      from ( select * from xsd order by PREFIX asc, rowid ) a
      where ROWNUM <= 40
     )
where rnum  >= 30;

another win for stackoverflow, thanks gentelmen, happy coding everyone


Solution

  • SQL tables represent unordered sets. As a consequence, SQL sorts are not stable. That is, rows with the same key value(s) can be in any order -- and on multiple runs the order can change. There is no "underlying" order for the rows.

    The solution is to add a unique id to the end of the order by. This would always be the last key:

    select *
    from (select a.*, ROWNUM as rnum
          from ( select * from xsd order by PREFIX asc, id ) a
          where ROWNUM <= 40
         )
    where rnum  >= 30;
    

    Whether or not this affects performance depends on whether indexes can be used for the sorting. If no indexes are used, then the effect should be very minor. If adding the extra key prevents an index from being used, though, then the impact is much larger.