sqlsql-serversortingalphanumeric

Sort strings logically in SQL with digit parts sorted as numerical values


I've got a table with codes such as 11, 101 ST-1001, ST-100 etc. and I want to sort the data so that the integer parts are sorted as their integer values. But the problem it's mixed up with string/varchar parts as well.

So, is it possible in MS-SQL to sort codes "logically" in the same way as the Windows StrCmpLogicalW dll function?

I've tried sorting it using a CASE statement, but I think this will become too complex if it has to take into account all the different string prefixes. See example code below.

CREATE TABLE metaforms (
    Code varchar(10),
    Description varchar(25),
    Variables integer
);
GO

INSERT INTO metaforms (
    Code,
    Description,
    Variables
) VALUES
('1', 'testform', 143),
('10015', 'measurement', 16),
('1020006510', 'Nutrition history', 50),
('1010010006', 'Translator', 8),
('10200310', 'Delivery Details: I', 13),
('10200320', 'Delivery Details: II', 12),
('10200330', 'Delivery Details: III', 12),
('1020100001', 'Comfort and Hygiene', 10),
('1020100025', 'Car Seat Evaluation', 7),
('1020100053', 'Length Weight', 7),
('AMB-100', 'AMB SF SA-DIABETES', 5),
('AMB-34', 'AMB SF SA-CHEST', 5),
('AMB-99', 'AMB SF SA-COUGH', 5),
('LQF-3912', 'DLBX-M DIAGNOSTICS (3912)', 21),
('LQF-40', 'DLBX-M DIAGNOSTICS (40)', 24),
('LQF-588', 'DLBX-M DIAGNOSTICS (588)', 34),
('ST-1010', 'Standard form 1010', 9),
('ST-200', 'Standard form 200', 9),
('ST-35', 'Standard form 35', 9);

-- try sorting
select * from metaforms
order by case when isnumeric(code)=1 then right('000000000000'+code, 12) else code end;

select * from metaforms order by LEN(code), code;

select * from metaforms
order by case when isnumeric(code)=1 then LEN(code) end, code;

The select statements in the code above sorts the numeric values correctly, but not the codes that also have a string part.

So the desired sorted result should be like this:

Code
-----------
1
10015
10200310
10200320
10200330
1010010006
1020006510
1020100001
1020100025
1020100053
AMB-34
AMB-99
AMB-100
LQF-40
LQF-588
LQF-3912
ST-35
ST-200
ST-1010

Solution

  • If all your data indeed matches the shown pattern ^([A-Z]*-|)[0-9]+$, we can take advantage of the - also denoting negative numbers. So instead of fucking around with stuff like “charindex + 1”, we just take the absolute value.

    select *
    from #metaforms
    order by left(code, charindex('-', Code))
        , abs(substring(code, charindex('-', Code), 999999999))
    

    Alternatively you could do this. Probably better actually?

    select *
    from #metaforms
    order by left(code, charindex('-', Code))
        , cast(right(code, len(code) - charindex('-', Code)) as int)
    

    Output:

    1
    10015
    10200310
    10200320
    10200330
    1010010006
    1020006510
    1020100001
    1020100025
    1020100053
    AMB-34
    AMB-99
    AMB-100
    LQF-40
    LQF-588
    LQF-3912
    ST-35
    ST-200
    ST-1010