python data-structures

How can I efficiently map and update frequencies of nested items using dictionaries in Python?


I'm working on a small DSA practice problem involving frequency counting, but with a twist.
I have a list of tuples where each tuple represents a (category, item). Example:

data = [
    ("fruit", "apple"),
    ("fruit", "banana"),
    ("veg", "carrot"),
    ("fruit", "apple"),
    ("veg", "carrot"),
    ("veg", "tomato"),
]

I want to build a dictionary that looks like this:

{
    "fruit": {"apple": 2, "banana": 1},
    "veg": {"carrot": 2, "tomato": 1}
}

I can do this with nested loops or by checking keys manually, but my code feels messy.

Is there a cleaner or more Pythonic way to build and update a nested dictionary like this? Bonus points if it handles missing keys gracefully without throwing KeyErrors.


Solution

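  • This is most readily accomplished with collections.defaultdict and collections.Counter, both of which are subclasses of dict. The defaultdict creates an empty Counter the first time a category is seen, and a Counter treats a missing item as 0, so no KeyError is raised: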

    from collections import Counter, defaultdict
    
    data = [
        ("fruit", "apple"),
        ("fruit", "banana"),
        ("veg", "carrot"),
        ("fruit", "apple"),
        ("veg", "carrot"),
        ("veg", "tomato"),
    ]
    
    d = defaultdict(Counter)
    
    for category, item in data:
        d[category][item] += 1
    
    print(d)
    

    Prints:

    defaultdict(<class 'collections.Counter'>, {'fruit': Counter({'apple': 2, 'banana': 1}), 'veg': Counter({'carrot': 2, 'tomato': 1})})
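
If you need the exact plain-dict shape shown in the question, one option is to convert the result once counting is done. This is only a sketch; the name `plain` is chosen here for illustration and is not part of the original answer:

```python
from collections import Counter, defaultdict

data = [
    ("fruit", "apple"),
    ("fruit", "banana"),
    ("veg", "carrot"),
    ("fruit", "apple"),
    ("veg", "carrot"),
    ("veg", "tomato"),
]

d = defaultdict(Counter)

for category, item in data:
    d[category][item] += 1

# Convert the outer defaultdict and each inner Counter to regular dicts.
# `plain` is an illustrative name, not something from the original answer.
plain = {category: dict(counts) for category, counts in d.items()}

print(plain)
# {'fruit': {'apple': 2, 'banana': 1}, 'veg': {'carrot': 2, 'tomato': 1}}
```

Converting also avoids a side effect of defaultdict: merely reading a missing category, such as d["meat"], inserts an empty Counter under that key, whereas plain["meat"] raises KeyError and plain.get("meat", {}) does not modify anything.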
    

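    You could also use a regular dict with the dict.setdefault method. Note that this constructs a new (usually discarded) Counter() on every iteration, because the default argument is evaluated before setdefault is called: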

    from collections import Counter
    
    data = [
        ("fruit", "apple"),
        ("fruit", "banana"),
        ("veg", "carrot"),
        ("fruit", "apple"),
        ("veg", "carrot"),
        ("veg", "tomato"),
    ]
    
    d = {}
    
    for category, item in data:
        d.setdefault(category, Counter())[item] += 1
    
    print(d)
    

    Prints:

    {'fruit': Counter({'apple': 2, 'banana': 1}), 'veg': Counter({'carrot': 2, 'tomato': 1})}
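
If you would rather avoid the collections module entirely, the same idea can be sketched with plain dicts, using dict.setdefault for the outer level and dict.get for the inner count. The helper name `count_nested` is hypothetical and chosen only for this example:

```python
def count_nested(pairs):
    """Count (category, item) pairs into a dict of dicts.

    `count_nested` is a hypothetical helper name used for illustration.
    """
    result = {}
    for category, item in pairs:
        # Fetch the inner dict for this category, creating it if needed.
        inner = result.setdefault(category, {})
        # get() returns 0 for an item not seen before, so no KeyError.
        inner[item] = inner.get(item, 0) + 1
    return result


data = [
    ("fruit", "apple"),
    ("fruit", "banana"),
    ("veg", "carrot"),
    ("fruit", "apple"),
    ("veg", "carrot"),
    ("veg", "tomato"),
]

print(count_nested(data))
# {'fruit': {'apple': 2, 'banana': 1}, 'veg': {'carrot': 2, 'tomato': 1}}
```

This yields regular dicts at both levels, at the cost of slightly more code inside the loop than the Counter-based versions.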