I am trying to sort a list by frequency of its elements.
>>> a = [5, 5, 4, 4, 4, 1, 2, 2]
>>> a.sort(key=a.count)
>>> a
[5, 5, 4, 4, 4, 1, 2, 2]
a
is unchanged. However:
>>> sorted(a, key=a.count)
[1, 5, 5, 2, 2, 4, 4, 4]
Why does this method not work for .sort()
?
What you see is the result of a certain CPython implementation detail of list.sort
. Try this again, but create a copy of a
first:
a.sort(key=a.copy().count)
a
# [1, 5, 5, 2, 2, 4, 4, 4]
.sort
modifies a
internally, so a.count
is going to produce un-predictable results. This is documented as an implementation detail.
What copy
call does is it creates a copy of a
and uses that list's count
method as the key. You can see what happens with some debug statements:
def count(x):
print(a)
return a.count(x)
a.sort(key=count)
[]
[]
[]
...
a
turns up as an empty list when accessed inside .sort
, and [].count(anything)
will be 0
. This explains why the output is the same as the input - the predicates are all the same (0
).
OTOH, sorted
creates a new list, so it doesn't have this problem.
If you really want to sort by frequency counts, the idiomatic method is to use a Counter
:
from collections import Counter
a.sort(key=Counter(a).get)
a
# [1, 5, 5, 2, 2, 4, 4, 4]