sql-serversql-server-2008-r2stable-sort

How to perform stable sort over multiple columns?


Imagine i have a data set that contains:

Date            Id
--------------  ----
11/1/2017       null
11/4/2017       3
11/5/2017       null
11/12/2017      10
null            1
null            2
null            7
null            8
null            9

I want the rows ordered so that both columns are increasing.

Using a naïve ORDER BY Date, ID does not do it:

enter image description here

There is an ordering

There is an ordering that satisfies the results of my desired sort order:

 

enter image description here

Or course, that's not a unique ordering:

Date            Id
--------------  ---------------
null            1
11/1/2017       null
null            2
11/4/2017       3
null            7
null            8
null            9
11/5/2017       null
11/12/2017      10

A programming language could do it

I know i can accomplish this on the client side. In a functional functional programming language: use a stable sorting algorithm:

A stable sort is one which preserves the original order of the input set, where the comparison algorithm does not distinguish between two or more items.

Consider a sorting algorithm that sorts cards by rank, but not by suit. The stable sort will guarantee that the original order of cards having the same rank is preserved; the unstable sort will not.

enter image description here

Unfortunately i have

of monotonically increasing rows to put in best possible chronological sort order. Obviously i'd prefer to do this on the server - which is well suited to handling large amounts of data.

How can i perform a stable-sort in SQL Server?


Example Data

 

CREATE TABLE #SortDemo (Date datetime NULL, Id int NULL)

INSERT INTO #SortDemo (Date, Id)
VALUES 
    ('20171101', null),
    ('20171104',    3),
    ('20171105', null),
    ('20171112',   10),
    (null,          1),
    (null,          2),
    (null,          7),
    (null,          8),
    (null,          9)


SELECT * FROM #SortDemo
ORDER BY Date, Id

Solution

  • It is a solvable problem. You don't have to throw your hands up and say computers cannot be used to solve problems.

    Start with the data, and add a new surrogate "Rank" column.

    Date            Id    Rank
    --------------  ----  ----
    null            7     null
    null            1     null
    null            9     null
    null            2     null  
    null            8     null
    11/1/2017       null  null
    11/4/2017       3     null
    11/5/2017       null  null
    11/12/2017      10    null
    11/13/2017      null  null
    

    Then seed Rank to the Id:

    UPDATE SortDemo SET Rank = Id
    WHERE Id IS NOT NULL
    
    Date            Id    Rank
    --------------  ----  ----
    null            7     7
    null            1     1 
    null            9     9 
    null            2     2 
    null            8     8 
    11/1/2017       null  null
    11/4/2017       3     3
    11/5/2017       null  null 
    11/12/2017      10    10
    11/13/2017      null  null
    

    Then for the remaining items items with a Date, we need to assign it to the "next" rank:

    Date            Id    Rank
    --------------  ----  ----
    null            7     7
    null            1     1 
    null            9     9 
    null            2     2 
    null            8     8 
    11/1/2017       null  null  <-- 3
    11/4/2017       3     3     
    11/5/2017       null  null  <-- 10
    11/12/2017      10    10    
    11/13/2017      null  null  <-- ?
    

    with

    UPDATE SortDemo SET Rank = (
          SELECT MIN(Rank) FROM SortDemo s2 
          WHERE s2.Date >= SortDemo.Date)
    WHERE SortDemo.Date IS NOT NULL
    
    Date            Id    Rank
    --------------  ----  ----
    null            7     7
    null            1     1 
    null            9     9 
    null            2     2 
    null            8     8 
    11/1/2017       null  3    <--
    11/4/2017       3     3     
    11/5/2017       null  10   <--
    11/12/2017      10    10    
    11/13/2017      null  null <-- ?
    

    There's also the edge case where there are no items "after" us; which we can fix by going backwards to one the highest previous rank:

    UPDATE SortDemo SET Rank = (
          SELECT MAX(Rank) FROM SortDemo s2 
          WHERE s2.Date <= SortDemo.Date)
    WHERE SortDemo.Date IS NOT NULL
    
    Date            Id    Rank
    --------------  ----  ----
    null            7     7
    null            1     1 
    null            9     9 
    null            2     2 
    null            8     8 
    11/1/2017       null  3
    11/4/2017       3     3     
    11/5/2017       null  10
    11/12/2017      10    10    
    11/13/2017      null  10 <--10
    

    And we're done

    We can now sort by surrogate Rank, and break ties by sorting by Date:

    SELECT * FROM SortDemo
    ORDER BY Rank, Date
    
    Date            Id    Rank
    --------------  ----  ----
    null            1     1 
    null            2     2 
    11/1/2017       null  3
    11/4/2017       3     3     
    null            7     7
    null            8     8 
    null            9     9 
    11/5/2017       null  10
    11/12/2017      10    10    
    11/13/2017      null  10
    

    Solution complete. Sql Fiddle

    The answer is in escrow until Monday, so i can give people the opportunity to earn reputation for solving a unique problem.