sql-servert-sql

Why VARCHAR data sorted as if there was a space appended on the end


Using SQL Server 2017, I encountered a query that was sorting on a VARCHAR column returning results in a different order than I expected, namely a string followed by a tab came before just the bare string. Further investigation showed that it was behaving as if the bare string had a space on the end.

Experimenting further, given the following:

SELECT * FROM (VALUES
    (1, 'a' + CHAR(33)),
    (2, 'a   '),
    (3, 'a  '),
    (4, 'a '),
    (5, 'a' + CHAR(31)),
    (6, 'a')
)v(rownum, text)
ORDER BY text

I would have expected the results to be:

rownum text
6 'a'
5 'a(31)'
4 'a(sp)'
3 'a(sp)(sp)'
2 'a(sp)(sp)(sp)'
1 'a(33)'

But instead it was:

rownum text
5 'a(31)'
3 'a(sp)(sp)'
4 'a(sp)'
2 'a(sp)(sp)(sp)'
6 'a'
1 'a(33)'

So it seems as if it treats 'a'+(no spaces) and 'a'+(any amount of spaces) as if they were all equivalent. What is going on here?


Solution

  • Because SQL Server follows ANSI standard string comparison rules, so 'a' and 'a ' are equal. eg

    SELECT *, rank() over (order by text) rank 
    FROM (VALUES
        (1, 'a' + CHAR(33)),
        (2, 'a   '),
        (3, 'a  '),
        (4, 'a '),
        (5, 'a' + CHAR(31)),
        (6, 'a')
    )v(rownum, text)
    ORDER BY text
    

    outputs

    rownum      text rank
    ----------- ---- --------------------
    5           a   1
    3           a    2
    4           a    2
    2           a    2
    6           a    2
    1           a!   6
    
    (6 rows affected)
    
    

    And the order in which rows are returned that are equal in the ORDER BY clause is undefined, and plan-dependent.