
Group emails into TO & CC with itertools.groupby and convert it to a dictionary


I'd like to group emails by their domain and convert the result into a dictionary. So far I have figured out that itertools.groupby with a custom key function will do that. It assigns the correct key to each value, but when I build a dictionary from the result, only the last group for each key survives if the values to be grouped are not contiguous.


import re
from itertools import groupby

{k: list(v) for k, v in groupby(["bar", "foo", "baz"], key=lambda x: "to" if re.search(r"^b", x) else "cc")}

This will produce {'to': ['baz'], 'cc': ['foo']} instead of {'to': ['bar', 'baz'], 'cc': ['foo']}.

How can I fix that?


Solution

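  • itertools.groupby only groups consecutive items, so a key that appears in several separate runs is yielded several times, and the dict comprehension overwrites the earlier groups. Use dict.setdefault or collections.defaultdict(list) and extend the existing list for each key, as shown below.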

    import re
    from itertools import groupby

    dct = {}
    for k, v in groupby(["awol", "bar", "foo", "baz"],
                        key=lambda x: "to" if re.search(r"^b", x) else "cc"):
        # The same key can come up more than once, so extend instead of assigning.
        dct.setdefault(k, []).extend(v)

    # Alternative with collections.defaultdict:
    #   from collections import defaultdict
    #   dct = defaultdict(list)
    #   ... and inside the loop: dct[k].extend(v)
    print(dct)
    

    {'cc': ['awol', 'foo'], 'to': ['bar', 'baz']}
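
If you would rather keep the dict comprehension, another option is to sort the input by the same key before grouping, so that every key forms a single contiguous run. This is only a sketch: `recipient_type` and `emails` are names made up here for illustration, and sorting changes the order in which the keys appear (the items within each key keep their relative order, because `sorted` is stable).

```python
import re
from itertools import groupby


def recipient_type(address):
    # Hypothetical helper: same rule as the lambda used above.
    return "to" if re.search(r"^b", address) else "cc"


emails = ["awol", "bar", "foo", "baz"]

# Without sorting, groupby yields one group per consecutive run,
# so "cc" and "to" each show up twice.
runs = [(k, list(v)) for k, v in groupby(emails, key=recipient_type)]
print(runs)

# Sorting by the same key first makes each key contiguous,
# so the plain dict comprehension no longer overwrites anything.
grouped = {
    k: list(v)
    for k, v in groupby(sorted(emails, key=recipient_type), key=recipient_type)
}
print(grouped)
```

Sorting costs O(n log n), while the setdefault loop is a single pass, so the loop is probably the better choice for large inputs or when the input order of the keys matters.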