
GNU sort - What is the default algorithm used for comparison?


I need help understanding the default comparison algorithm used by GNU sort. I assumed it sorted lexicographically, but I found some behavior that does not match that assumption. As an example, take the following strings:

alex.
alex.a
alex.Z
alexa
alex0
alexZ
alex.~
alex
alex.|
alex.}
alex.abc

I sort them in a shell by piping them to sort, like echo 'stuff' | sort

This is the result I get:

alex
alex.
alex.~
alex.|
alex.}
alex0
alexa
alex.a
alex.abc
alexZ
alex.Z

I can't figure out why alex0 and alexa appear between alex.} and alex.a.

Can someone explain this to me?


Solution

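  • The order depends on the locale's collation rules (LC_COLLATE, which LC_ALL overrides), not on raw character codes. In many non-C locales, punctuation such as "." is typically ignored on the first comparison pass, which is why alex0 and alexa can land among the alex.* entries. Compare the default locale with LC_ALL=C, which compares byte values: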

    $ sort sort 
    alex
    alex.
    alex.~
    alex.|
    alex.}
    alex0
    alexa
    alex.a
    alex.abc
    alexZ
    alex.Z
    $ LC_ALL=C sort sort
    alex
    alex.
    alex.Z
    alex.a
    alex.abc
    alex.|
    alex.}
    alex.~
    alex0
    alexZ
    alexa
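
To reproduce the byte-order result without creating a file, something like the following sketch should work. It assumes a POSIX shell with sort on the PATH. The variable names input and c_sorted are only illustrative, and the last line simply prints the locale variables that influence collation in the current session.

```shell
#!/bin/sh
# Strings from the question, one per line.
input='alex.
alex.a
alex.Z
alexa
alex0
alexZ
alex.~
alex
alex.|
alex.}
alex.abc'

# LC_ALL=C applies to this one sort invocation only and forces
# byte-wise comparison instead of the locale's collation rules.
c_sorted=$(printf '%s\n' "$input" | LC_ALL=C sort)
printf '%s\n' "$c_sorted"

# Print the variables that decide collation in this shell (may be empty).
echo "LC_ALL=${LC_ALL:-} LC_COLLATE=${LC_COLLATE:-} LANG=${LANG:-}"
```

Prefixing the assignment to a single command, as above, leaves the rest of the session in its normal locale. Running the same pipeline without LC_ALL=C shows whatever order the current locale defines, which may differ between systems.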