I'm working on a small DSA practice problem involving frequency counting, but with a twist.
I have a list of tuples where each tuple represents a (category, item). Example:
data = [
("fruit", "apple"),
("fruit", "banana"),
("veg", "carrot"),
("fruit", "apple"),
("veg", "carrot"),
("veg", "tomato"),
]
I want to build a dictionary that looks like this:
{
"fruit": {"apple": 2, "banana": 1},
"veg": {"carrot": 2, "tomato": 1}
}
I can do this with nested loops or by checking keys manually, but my code feels messy.
Is there a cleaner or more Pythonic way to build and update a nested dictionary like this? Bonus points if it handles missing keys gracefully without throwing KeyErrors.
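The simplest way is to combine collections.defaultdict and collections.Counter, both of which are dict subclasses. defaultdict(Counter) creates an empty Counter the first time a category is accessed, and a Counter treats a missing item as 0, so no KeyError can occur: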
from collections import Counter, defaultdict

data = [
    ("fruit", "apple"),
    ("fruit", "banana"),
    ("veg", "carrot"),
    ("fruit", "apple"),
    ("veg", "carrot"),
    ("veg", "tomato"),
]

# A missing category gets an empty Counter; a missing item counts from 0.
d = defaultdict(Counter)
for category, item in data:
    d[category][item] += 1

print(d)
Prints:
defaultdict(<class 'collections.Counter'>, {'fruit': Counter({'apple': 2, 'banana': 1}), 'veg': Counter({'carrot': 2, 'tomato': 1})})
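If you need exactly the plain nested dict shown in the question rather than a defaultdict of Counter objects, one option is to convert at the end. This is only a sketch; the names counts and plain are my own and not part of the question:

```python
from collections import Counter, defaultdict

data = [
    ("fruit", "apple"),
    ("fruit", "banana"),
    ("veg", "carrot"),
    ("fruit", "apple"),
    ("veg", "carrot"),
    ("veg", "tomato"),
]

counts = defaultdict(Counter)
for category, item in data:
    counts[category][item] += 1

# Turn each inner Counter and the outer defaultdict into ordinary dicts.
plain = {category: dict(counter) for category, counter in counts.items()}

print(plain)
# {'fruit': {'apple': 2, 'banana': 1}, 'veg': {'carrot': 2, 'tomato': 1}}
```

Since Counter and defaultdict are dict subclasses, the conversion is mostly cosmetic. It matters mainly if you want the printed form to match, or if you want lookups of missing keys to raise KeyError again.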
You could also use a regular dict with the dict.setdefault method. Note that this constructs a new Counter() on every iteration, even when the category already exists:
from collections import Counter

data = [
    ("fruit", "apple"),
    ("fruit", "banana"),
    ("veg", "carrot"),
    ("fruit", "apple"),
    ("veg", "carrot"),
    ("veg", "tomato"),
]

d = {}
for category, item in data:
    # setdefault returns the existing Counter, or stores and returns the new one.
    d.setdefault(category, Counter())[item] += 1

print(d)
Prints:
{'fruit': Counter({'apple': 2, 'banana': 1}), 'veg': Counter({'carrot': 2, 'tomato': 1})}
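If you would rather avoid imports altogether, the same idea can be sketched with built-in dict methods only, using setdefault for the outer level and get for the inner count. The function name count_nested is hypothetical, chosen just for this example:

```python
def count_nested(pairs):
    """Count items per category using only plain dicts."""
    result = {}
    for category, item in pairs:
        # Fetch the inner dict for this category, creating it if needed.
        inner = result.setdefault(category, {})
        # get() supplies 0 the first time an item is seen.
        inner[item] = inner.get(item, 0) + 1
    return result


data = [
    ("fruit", "apple"),
    ("fruit", "banana"),
    ("veg", "carrot"),
    ("fruit", "apple"),
    ("veg", "carrot"),
    ("veg", "tomato"),
]

result = count_nested(data)
print(result)
# {'fruit': {'apple': 2, 'banana': 1}, 'veg': {'carrot': 2, 'tomato': 1}}

# Reading a missing category or item without raising KeyError:
missing = result.get("dairy", {}).get("milk", 0)
print(missing)
# 0
```

Chaining get like this also leaves result unchanged, whereas indexing a defaultdict with a missing key inserts that key as a side effect.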