
Merge several dictionaries, collecting differing values into a list


I have a list of several dictionaries that all have the same keys. Some of the dictionaries are identical except for one value. How can I merge them into a single dictionary that holds those differing values as a list?

Let me give you an example:

Let's say I have these dictionaries:

[{'a':1, 'b':2,'c':3},{'a':1, 'b':2,'c':4},{'a':1, 'b':3,'c':3},{'a':1, 'b':3,'c':4}]

My desired output would be this:

[{'a':1, 'b':2,'c':[3,4]},{'a':1, 'b':3,'c':[3,4]}]

I've tried nested for loops and if statements, but that is expensive and messy, and I'm sure there must be a better way. Could you give me a hand?

How could I do this for any list of dictionaries, assuming they all have the same number of keys and that I know the name of the key whose values should be merged into a list (c in this case)?

thanks!


Solution

  • Use a collections.defaultdict to group the c values under (a, b) tuple keys:

    from collections import defaultdict
    
    lst = [
        {"a": 1, "b": 2, "c": 3},
        {"a": 1, "b": 2, "c": 4},
        {"a": 1, "b": 3, "c": 3},
        {"a": 1, "b": 3, "c": 4},
    ]
    
    d = defaultdict(list)
    for x in lst:
        d[x["a"], x["b"]].append(x["c"])
    
    result = [{"a": a, "b": b, "c": c} for (a, b), c in d.items()]
    
    print(result)
    

    You could also use itertools.groupby, which only groups consecutive items, so lst must already be ordered by a and b:

    from itertools import groupby
    from operator import itemgetter
    
    lst = [
        {"a": 1, "b": 2, "c": 3},
        {"a": 1, "b": 2, "c": 4},
        {"a": 1, "b": 3, "c": 3},
        {"a": 1, "b": 3, "c": 4},
    ]
    
    result = [
        {"a": a, "b": b, "c": [x["c"] for x in g]}
        for (a, b), g in groupby(lst, key=itemgetter("a", "b"))
    ]
    
    print(result)
    

    If lst is not ordered by a and b, sort it by those two keys first:

    result = [
        {"a": a, "b": b, "c": [x["c"] for x in g]}
        for (a, b), g in groupby(
            sorted(lst, key=itemgetter("a", "b")), key=itemgetter("a", "b")
        )
    ]
    
    print(result)
    

    Output:

    [{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
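
To see why the ordering matters for the groupby variants, here is a small sketch. The names unsorted_lst, by_ab and merge_with_groupby are made up for this illustration, and the input is the same data as above, only shuffled:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input: the same rows as above, but not ordered by "a" and "b".
unsorted_lst = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 1, "b": 3, "c": 3},
    {"a": 1, "b": 2, "c": 4},
    {"a": 1, "b": 3, "c": 4},
]

by_ab = itemgetter("a", "b")


def merge_with_groupby(rows):
    # groupby starts a new group whenever the key changes between
    # consecutive rows; it never looks back at earlier groups.
    return [
        {"a": a, "b": b, "c": [row["c"] for row in group]}
        for (a, b), group in groupby(rows, key=by_ab)
    ]


fragmented = merge_with_groupby(unsorted_lst)
merged = merge_with_groupby(sorted(unsorted_lst, key=by_ab))

print(len(fragmented))  # 4: every row ends up in its own group
print(merged)
# [{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
```

The defaultdict version does not have this restriction, since it looks up the group by key rather than by position.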
    

    Update

    For a more generic solution that works with any number of grouping keys:

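    from collections import defaultdict

    def merge_lst_dicts(lst, keys, merge_key):
        # Map each tuple of grouping values to the merge_key values seen for it.
        groups = defaultdict(list)

        for item in lst:
            # item.get returns None for a missing key instead of raising KeyError.
            key = tuple(item.get(k) for k in keys)
            groups[key].append(item.get(merge_key))

        # Rebuild one dictionary per group, with the collected values as a list.
        return [
            {**dict(zip(keys, group_key)), merge_key: merged_values}
            for group_key, merged_values in groups.items()
        ]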
    
    print(merge_lst_dicts(lst, ["a", "b"], "c"))
    # [{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
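
Since the question only gives the name of the key to merge, the grouping keys could also be derived rather than passed in. The following is a minimal sketch of that idea, not part of the original answer: merge_on_key is a hypothetical helper, and it assumes that every dictionary contains merge_key, that the other values are hashable, and that the key names can be sorted (for example, all strings).

```python
from collections import defaultdict


def merge_on_key(lst, merge_key):
    """Group dicts by every key except merge_key and collect its values."""
    groups = defaultdict(list)

    for item in lst:
        # Everything except merge_key identifies the group. Sorting the
        # (key, value) pairs makes the group key independent of the
        # insertion order of each dictionary.
        group_key = tuple(
            sorted((k, v) for k, v in item.items() if k != merge_key)
        )
        groups[group_key].append(item[merge_key])

    # Turn each group key back into a dict and attach the collected values.
    return [
        {**dict(group_key), merge_key: values}
        for group_key, values in groups.items()
    ]


lst = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 1, "b": 2, "c": 4},
    {"a": 1, "b": 3, "c": 3},
    {"a": 1, "b": 3, "c": 4},
]

print(merge_on_key(lst, "c"))
# [{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
```

The same call with a different merge key, such as merge_on_key(lst, "b"), groups by a and c instead, so nothing about the key names is hard-coded.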