pythonlistsorting

Using list.count to sort a list in-place using .sort() does not work. Why?


I am trying to sort a list by frequency of its elements.

>>> a = [5, 5, 4, 4, 4, 1, 2, 2]
>>> a.sort(key=a.count)
>>> a
[5, 5, 4, 4, 4, 1, 2, 2]

a is unchanged. However:

>>> sorted(a, key=a.count)
[1, 5, 5, 2, 2, 4, 4, 4]

Why does this method not work for .sort()?


Solution

  • What you see is the result of a certain CPython implementation detail of list.sort. Try this again, but create a copy of a first:

    a.sort(key=a.copy().count)
    a
    # [1, 5, 5, 2, 2, 4, 4, 4]
    

    .sort modifies a internally, so a.count is going to produce un-predictable results. This is documented as an implementation detail.

    What copy call does is it creates a copy of a and uses that list's count method as the key. You can see what happens with some debug statements:

    def count(x):
        print(a)
        return a.count(x)
    
    a.sort(key=count)
    []
    []
    []
    ...
    

    a turns up as an empty list when accessed inside .sort, and [].count(anything) will be 0. This explains why the output is the same as the input - the predicates are all the same (0).

    OTOH, sorted creates a new list, so it doesn't have this problem.


    If you really want to sort by frequency counts, the idiomatic method is to use a Counter:

    from collections import Counter
    
    a.sort(key=Counter(a).get)
    a
    # [1, 5, 5, 2, 2, 4, 4, 4]