I have a complex nested dictionary structure and I need to remove elements based on the values in a nested dictionary. My dictionary looks like this:
my_dict = {
    'item1': {'name': 'Apple', 'price': 1.0, 'category': {'id': 1, 'name': 'Fruit'}},
    'item2': {'name': 'Banana', 'price': 0.5, 'category': {'id': 1, 'name': 'Fruit'}},
    'item3': {'name': 'Carrot', 'price': 0.75, 'category': {'id': 2, 'name': 'Vegetable'}},
    'item4': {'name': 'Broccoli', 'price': 1.5, 'category': {'id': 2, 'name': 'Vegetable'}}
}
I want to filter this dictionary to only include items belonging to the 'Fruit' category. I tried the following code:
new_dict = {}
for key, value in my_dict.items():
    if value['category']['name'] == 'Fruit':
        new_dict[key] = value
print(new_dict)
This works, but I'm wondering if there's a more concise or Pythonic way to achieve this, perhaps using a dictionary comprehension or a filtering function like filter().
A dictionary comprehension can be used to create dictionaries from arbitrary key and value expressions.
new_dict2 = {
    key: value
    for key, value in my_dict.items()
    if value['category']['name'] == 'Fruit'
}
new_dict2 == new_dict
# True
filter()
The filter() function is used to:
Construct an iterator from those elements of iterable for which function is true.
dict.items() returns an iterable in which each element is a (key, value) tuple. We can pass each item to a lambda function, where item[0] is the key and item[1] is the value. filter() returns an iterator over the tuples that match the condition. We can wrap this in dict() to get a dictionary (in the same way that dict([("key1", "value1"), ("key2", "value2")]) returns {'key1': 'value1', 'key2': 'value2'}).
new_dict3 = dict(
    filter(
        lambda item: item[1]['category']['name'] == 'Fruit',
        my_dict.items()
    )
)
new_dict3 == new_dict
# True
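To make the intermediate step visible, here is a small sketch of what filter() hands to dict(). It uses a shortened stand-in dictionary called sample (a hypothetical name, not part of the question) so that it runs on its own:

```python
# Shortened stand-in for my_dict so the snippet is self-contained.
sample = {
    'item1': {'name': 'Apple', 'category': {'id': 1, 'name': 'Fruit'}},
    'item3': {'name': 'Carrot', 'category': {'id': 2, 'name': 'Vegetable'}},
}

# filter() returns a lazy iterator; nothing is evaluated yet.
matches = filter(
    lambda item: item[1]['category']['name'] == 'Fruit',
    sample.items(),
)

# Consuming the iterator yields the matching (key, value) tuples.
pairs = list(matches)

# dict() accepts an iterable of 2-tuples and rebuilds a dictionary.
fruit_only = dict(pairs)
```

Note that the iterator can only be consumed once: after list(matches), a second pass over matches yields nothing, which is one reason to wrap it in dict() straight away.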
Achieving the nebulous goal of Pythonicness (Pythonicity?) is always somewhat subjective. I think a dictionary comprehension is clean and neat, but it can be hard to see what it's doing, especially if the dict is deeply nested or the condition is complex. It's probably clearest if you wrap it in an appropriately named function so you can see what's going on. I've added type annotations for clarity:
def find_fruit(d: dict[str, dict]) -> dict[str, dict]:
    def is_fruit(key: str, value: dict) -> bool:
        return value["category"]["name"] == "Fruit"

    return {key: value for key, value in d.items() if is_fruit(key, value)}

fruit_dict = find_fruit(my_dict)
new_dict == fruit_dict
# True
This is fundamentally the same as the first approach but easier on the eyes.
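If you need more than one category, the same idea can be parameterised. The following is only a sketch: filter_by_category and its category_name parameter are names I've made up for illustration, and the use of .get() assumes that items without a category should be skipped quietly rather than raise a KeyError.

```python
def filter_by_category(d: dict[str, dict], category_name: str) -> dict[str, dict]:
    """Return the items of d whose nested category name matches category_name."""
    return {
        key: value
        for key, value in d.items()
        # .get() with defaults: an item with no 'category' entry is skipped
        # instead of raising KeyError (an assumption about the desired behaviour).
        if value.get("category", {}).get("name") == category_name
    }


# Small stand-in data set so the snippet runs on its own.
items = {
    "item1": {"name": "Apple", "category": {"id": 1, "name": "Fruit"}},
    "item3": {"name": "Carrot", "category": {"id": 2, "name": "Vegetable"}},
    "item5": {"name": "Mystery"},  # no category at all
}

fruit = filter_by_category(items, "Fruit")
vegetables = filter_by_category(items, "Vegetable")
```

If a missing category should be treated as an error instead, index directly with value["category"]["name"] as in find_fruit.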