I am trying to sort a list based on the frequency of it's elements. But I get two different answers when the list is sorted and list is un-sorted. Please see the code segment below.
Could someone explain the cause. Thank you.
from collections import Counter
l = [1,1,0,0,5,2,5,5,3,4,33,0]
# Stores the frequency of each element as {elem: freq}
c = Counter(l)
# Sorting the list based on the frequency of the elements
lst1 = sorted(l, key=lambda x: -c[x])
# lst1: [0, 0, 5, 5, 5, 0, 1, 1, 2, 3, 4, 33]
l.sort()
# Sorting the list based on the frequency of the elements
lst2 = sorted(l, key=lambda x: -c[x])
# lst2: [0, 0, 0, 5, 5, 5, 1, 1, 2, 3, 4, 33]
Both results are correct.
Since both occurrences, c[0]
and c[5]
, evaluate to 3
(in this case), and that number alone is used as the sorting key in both cases, the sorting algorithm will treat both integers as "equal" and sort them depending on the order it encountered them only.
Looking at the documentation of sorted
tells us that this is a feature of the sorting algorithm:
The built-in sorted() function is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal
If you'd like to sort by the integer's value, in case both occurrences are the same, you can extend the sorting function to return a tuple, e.g.:
lst = sorted(l, key=lambda x: (-c[x], x))
# lst: [0, 0, 0, 5, 5, 5, 1, 1, 2, 3, 4, 33]