python sorting natsort

python natsort sort strings recursively


I find that with natsort.natsorted the sorting order changes part-way through a string:

In [31]: import natsort as ns
In [32]: ns.natsorted(["01-08", "02-07", "01-06", "02-09"])
Out[32]: ['01-08', '01-06', '02-09', '02-07']

In this case, the behaviour I want is:

In [33]: sorted(["01-08", "02-07", "01-06", "02-09"])
Out[33]: ['01-06', '01-08', '02-07', '02-09']

Solution

  • Try this:

    ns.natsorted(["01-08", "02-07", "01-06", "02-09"], alg=ns.ns.INT | ns.ns.UNSIGNED)
    

    The problem is that natsorted is reading the "-" in your strings as a minus sign. The call above explicitly sets the algorithm to look for unsigned ints, so the "-" is treated as an ordinary separator. Otherwise, it searches for signed ints, and "-08" is interpreted as -8 and "-06" as -6. Since -8 < -6, "01-08" sorts before "01-06", which is exactly the order you are seeing.

    This is equivalent to versorted, which is just a shortcut for the same algorithm, but I think it's better to state explicitly what you're doing, especially since versorted could change in the future to handle version strings more specifically.
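
To see why the signed interpretation produces the order in the question, here is a minimal sketch using only the standard library. It is not natsort's actual implementation: `split_key` is a hypothetical helper written for illustration, which splits each string into text and integer chunks and optionally treats a leading "-" or "+" as part of the number.

```python
import re


def split_key(text, signed):
    """Illustrative sort key (an assumption, not natsort's real code).

    Splits text into alternating string / int chunks. With signed=True,
    a "-" or "+" directly before digits is absorbed into the number.
    """
    pattern = r"([-+]?\d+)" if signed else r"(\d+)"
    parts = re.split(pattern, text)
    # re.split with a capturing group places the numeric matches at odd indices.
    return tuple(int(p) if i % 2 else p for i, p in enumerate(parts))


data = ["01-08", "02-07", "01-06", "02-09"]

# Signed: "01-08" becomes ('', 1, '', -8, ''), so -8 sorts before -6.
signed_order = sorted(data, key=lambda s: split_key(s, signed=True))

# Unsigned: "01-08" becomes ('', 1, '-', 8, ''), so 6 sorts before 8.
unsigned_order = sorted(data, key=lambda s: split_key(s, signed=False))

print(signed_order)    # ['01-08', '01-06', '02-09', '02-07']
print(unsigned_order)  # ['01-06', '01-08', '02-07', '02-09']
```

The signed result reproduces the unexpected order from the question, and the unsigned result matches what plain sorted returns for this input.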