python list dictionary lambda heapq

Make a list of the two largest and two smallest items of the same collection using the heapq functions nlargest() and nsmallest()


I want to make a list of the two largest and two smallest items in a list of dictionaries, based on a dictionary key (in this case 'price', as shown in the code below), using the heapq module's two functions nlargest() and nsmallest(). I tried this code and it didn't work:

import heapq


portfolio = [
    {'name': 'FACEBOOK', 'shares': 100, 'price': 91.1},
    {'name': 'MICROSOFT', 'shares': 50, 'price': 543.22},
    {'name': 'APPLE', 'shares': 200, 'price': 21.09},
    {'name': 'AMAZON', 'shares': 35, 'price': 31.75}
]


cheap = heapq.nsmallest(2, portfolio)
expensive = heapq.nlargest(2, portfolio)
print('the two cheap stocks:', cheap)
print('the two expensive stocks:', expensive)

Then I found a solution that uses lambda. It works, but I don't understand the use of lambda in this context. This is the version with the solution included:

import heapq


portfolio = [
    {'name': 'FACEBOOK', 'shares': 100, 'price': 91.1},
    {'name': 'MICROSOFT', 'shares': 50, 'price': 543.22},
    {'name': 'APPLE', 'shares': 200, 'price': 21.09},
    {'name': 'AMAZON', 'shares': 35, 'price': 31.75}
]
# the lambda solution
cheap = heapq.nsmallest(2, portfolio, key=lambda x: x['price'])
expensive = heapq.nlargest(2, portfolio, key=lambda x: x['price'])
print('the two cheap stocks:', cheap)
print('the two expensive stocks:', expensive)

This is the output, and it's exactly what I expected:

the two cheap stocks: [{'name': 'APPLE', 'shares': 200, 'price': 21.09}, {'name': 'AMAZON', 'shares': 35, 'price': 31.75}]
the two expensive stocks: [{'name': 'MICROSOFT', 'shares': 50, 'price': 543.22}, {'name': 'FACEBOOK', 'shares': 100, 'price': 91.1}]

I'm hoping for a good explanation of the use of lambda in the key argument of nlargest() and nsmallest(). Thanks in advance.


Solution

  • When you have a list of elements and want, for example, to sort them (or to find the maximum, etc.), you have to compare the elements with one another. Comparisons are operations like x > 3, x == y, etc.

    Now, your elements are dictionaries. I.e., you have a list of dictionaries:

    lst = [{'a': 'FACEBOOK', 'b': 100, 'c': 91.1},
           {'a': 'MICROSOFT', 'b': 50, 'c': 543.22}]
    

    If you want to find the max of such a list, you therefore need to compare dictionaries. Is {'a': 'FACEBOOK', 'b': 100, 'c': 91.1} bigger than {'a': 'MICROSOFT', 'b': 50, 'c': 543.22}? In your mind, it's not (because of the 'c' value). But the Python interpreter does not know that you want to compare the dictionaries using the 'c' key. That's why you need the key parameter: it is a function that is called on each element of the list and returns the value to use when comparing elements.

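    In your case, lambda x: x['price'] is a small anonymous function that takes one dictionary and returns its price, so you're telling the heapq functions to compare the dictionaries in your list by the value associated with the 'price' key.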

    For more details check out this question.
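
To make the role of key more concrete, here is a minimal sketch built on the portfolio from the question. It first shows that calling nsmallest() without a key fails, because Python 3 defines no ordering between dictionaries, and then shows three interchangeable ways of supplying the key. The helper name get_price is made up for illustration; operator.itemgetter is a standard-library alternative that the question does not use.

```python
import heapq
from operator import itemgetter

portfolio = [
    {'name': 'FACEBOOK', 'shares': 100, 'price': 91.1},
    {'name': 'MICROSOFT', 'shares': 50, 'price': 543.22},
    {'name': 'APPLE', 'shares': 200, 'price': 21.09},
    {'name': 'AMAZON', 'shares': 35, 'price': 31.75},
]

# Without a key, heapq has to compare the dictionaries themselves,
# and Python 3 raises TypeError for `dict < dict`.
try:
    heapq.nsmallest(2, portfolio)
    failed_without_key = False
except TypeError:
    failed_without_key = True


def get_price(stock):
    """Hypothetical helper: does the same job as lambda x: x['price']."""
    return stock['price']


# All three calls rank the dictionaries by their 'price' value.
with_lambda = heapq.nsmallest(2, portfolio, key=lambda x: x['price'])
with_function = heapq.nsmallest(2, portfolio, key=get_price)
with_itemgetter = heapq.nsmallest(2, portfolio, key=itemgetter('price'))

cheap_names = [stock['name'] for stock in with_lambda]
print(failed_without_key)
print(cheap_names)
```

The lambda is only a shorter way of writing get_price inline. The key function decides how elements are ranked, but the returned list still contains the original dictionaries, not the prices.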