
PowerShell Sort of Strings with Underscores


The following list does not sort properly (IMHO):

$a = @( 'ABCZ', 'ABC_', 'ABCA' )
$a | sort
ABC_
ABCA
ABCZ

Both my handy ASCII chart and the Unicode C0 Controls and Basic Latin chart give the underscore (low line) an ordinal of 95 (U+005F). That is higher than any of the capital letters A-Z (65 through 90), so Sort should have put the string ending with an underscore last.
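As a quick sanity check on those ordinals, here is a minimal sketch that reads the code units straight from PowerShell instead of from a chart. The variable names are mine, chosen only for illustration.

```powershell
# Cast each character to [int] to get its UTF-16 code unit.
$upperA     = [int][char] 'A'   # 65
$upperZ     = [int][char] 'Z'   # 90
$underscore = [int][char] '_'   # 95

# By code unit alone, the underscore ranks above every capital letter.
$underscore -gt $upperZ
```

This only confirms the numeric values; it says nothing yet about which comparison rule a given sort actually applies.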

Get-Culture returns en-US.

The next set of commands does what I expect:

$a = @( 'ABCZ', 'ABC_', 'ABCA' )
[System.Collections.ArrayList] $al = $a
$al.Sort( [System.StringComparer]::Ordinal )
$al
ABCA
ABCZ
ABC_

Now I create an ANSI-encoded file containing those same three strings:

Get-Content -Encoding Byte data.txt
65 66 67 90 13 10  65 66 67 95 13 10  65 66 67 65 13 10
$a = Get-Content data.txt
[System.Collections.ArrayList] $al = $a
$al.Sort( [System.StringComparer]::Ordinal )
$al
ABC_
ABCA
ABCZ

Once more, the string containing the underscore (low line) is not sorted correctly. What am I missing?


Edit:

Let's call this example #4:

'A' -lt '_'
False
[char] 'A' -lt [char] '_'
True

It seems like both statements should be False or both should be True. The first statement compares strings and the second compares Char values. A string is merely a sequence of Char values, so I would expect the two comparisons to be equivalent.
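To make the contrast in example #4 explicit, the sketch below puts the two operator forms next to an explicitly ordinal string comparison. My understanding, stated as an assumption rather than taken from anything above, is that -lt on string operands applies linguistic (culture-aware) rules, while on Char operands it compares the numeric code units. The variable names are only illustrative.

```powershell
# String operands: -lt does not compare raw code units here.
$stringResult = 'A' -lt '_'

# Char operands: -lt compares the code units numerically (65 vs 95).
$charResult = [char] 'A' -lt [char] '_'

# Explicit ordinal comparison of the two strings.
# A negative result means 'A' sorts before '_'.
$ordinal = [string]::CompareOrdinal('A', '_')

"string -lt: $stringResult; char -lt: $charResult; ordinal sign: $([Math]::Sign($ordinal))"
```

If that assumption holds, the two statements differ because they use different comparison rules, not because the characters differ.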

And now for example #5:

Get-Content -Encoding Byte data.txt
65 66 67 90 13 10  65 66 67 95 13 10  65 66 67 65 13 10
$a = Get-Content data.txt
$b = @( 'ABCZ', 'ABC_', 'ABCA' )
$a[0] -eq $b[0]; $a[1] -eq $b[1]; $a[2] -eq $b[2];
True
True
True
[System.Collections.ArrayList] $al = $a
[System.Collections.ArrayList] $bl = $b
$al[0] -eq $bl[0]; $al[1] -eq $bl[1]; $al[2] -eq $bl[2];
True
True
True
$al.Sort( [System.StringComparer]::Ordinal )
$bl.Sort( [System.StringComparer]::Ordinal )
$al
ABC_
ABCA
ABCZ
$bl
ABCA
ABCZ
ABC_

The two ArrayLists contain the same strings, but they are sorted differently. Why?
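One way to probe the difference in example #5 is to look past -eq and check whether the elements carry anything besides their text. Get-Content attaches NoteProperty members (PSPath, ReadCount and so on) to each line it emits; string literals have none. The sketch below assumes, without proof, that this decoration is what makes the comparer treat the two lists differently, and it uses a hypothetical scratch file in place of data.txt.

```powershell
# Hypothetical scratch file standing in for data.txt.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sort-demo.txt'
Set-Content -Path $path -Value 'ABCZ', 'ABC_', 'ABCA'

$fromFile = Get-Content $path
$literal  = @( 'ABCZ', 'ABC_', 'ABCA' )

# Count the NoteProperty members attached to the first element of each array.
$fileNotes    = @($fromFile[0].PSObject.Properties | Where-Object MemberType -eq 'NoteProperty')
$literalNotes = @($literal[0].PSObject.Properties  | Where-Object MemberType -eq 'NoteProperty')
"From file: $($fileNotes.Count) note properties; literal: $($literalNotes.Count)"

# Casting to [string[]] stores plain strings, so the ordinal comparer
# receives ordinary System.String values.
[string[]] $plain = $fromFile
[Array]::Sort($plain, [System.StringComparer]::Ordinal)
$plain

Remove-Item $path
```

The cast is a workaround sketch, not an explanation. It only shows that the ordinal order comes back once the lines are held as plain strings.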


Solution

  • Many moons later, let me attempt a comprehensive summary:

    By design:


    Unexpectedly: