sql-server t-sql indexing pagination keyset-pagination

Pagination from two tables in SQL Server

I have two tables with the following schema:

Table A: ColumnA, UserId, ... - rest of the schema omitted for brevity
Table B: ColumnB, UserId, ... - rest of the schema omitted for brevity

The tables can have duplicate values between them. For e.g - Table A row (<some-columnA-value>, 1, ...) and Table B row (<some-columnB-value>, 1, ...), 1 being the UserId.

Now, I have an API which is used to fetch all the UserId values from both tables. With increasing data, I want to now use pagination for this API and would like to modify the queries accordingly. There should also not be any duplicates over the pages or within a page.

How do I achieve this? Also a requirement is that I need to use keyset pagination rather than offset pagination since offset pagination gets slower as and when the offset increases.

So far, I have thought of using indexed views since there is only 1 column that I require to fetch but since the data keeps changing quite frequently and in large volumes, the overhead of maintaining the indexed view is not optimal.

Table A:

Column A	UserId
x	1
y	2
z	3
w	4

Table B:

Column B	UserId
a	1
b	3
c	5
d	6

Result (if no page size):

UserId
1
2
3
4
5
6

Result (if page size 3)

Page 1

UserId
1
2
3

Page 2

UserId
4
5
6

Solution

As I mention in my canonical post on Keyset Pagination (paging by key), the base query needs to be efficient.

The key (sic) in your case, is to use a Merge Union, group up by ID, and then take the top 3. This means that the grouping is over an already sorted set, and can use a Stream Aggregation.

So first UNION ALL the tables together, then GROUP BY UserId the final result and take TOP (3).

SELECT TOP (3)
  t.UserId
FROM (
    SELECT a.UserId
    FROM TableA a
    UNION ALL
    SELECT b.UserId
    FROM TableB b
) t
GROUP BY t.UserId
ORDER BY UserId;

Final query plan is nice and neat. Note that the union is keeping the ordering correct for the later stages.

As I mention, you need to keep the previous highest ID value, and pass it in to the next query. Put it into both halves of the union.

SELECT TOP (3)
  t.UserId
FROM (
    SELECT a.UserId
    FROM TableA a
    WHERE a.UserId > @previousId
    UNION ALL
    SELECT b.UserId
    FROM TableB b
    WHERE b.UserId > @previousId
) t
GROUP BY t.UserId
ORDER BY UserId;

db<>fiddle