I have a list of dictionaries that all have the same keys. Some of the dictionaries are identical except for one value. How can I merge those into a single dictionary that holds the differing values as a list?
Let me give you an example. Let's say I have these dictionaries:
[{'a':1, 'b':2,'c':3},{'a':1, 'b':2,'c':4},{'a':1, 'b':3,'c':3},{'a':1, 'b':3,'c':4}]
My desired output would be this:
[{'a':1, 'b':2,'c':[3,4]},{'a':1, 'b':3,'c':[3,4]}]
I've tried using nested for and if, but it's too expensive and nasty, and I'm sure there must be a better way. Could you give me a hand?
How could I do that for any kind of dictionary, assuming that the number of keys is the same across the dictionaries and knowing the name of the key to be merged into a list (c in this case)?
Thanks!
Use a collections.defaultdict to group the c values by (a, b) tuple keys:
from collections import defaultdict
lst = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 1, "b": 2, "c": 4},
    {"a": 1, "b": 3, "c": 3},
    {"a": 1, "b": 3, "c": 4},
]
d = defaultdict(list)
for x in lst:
    d[x["a"], x["b"]].append(x["c"])
result = [{"a": a, "b": b, "c": c} for (a, b), c in d.items()]
print(result)
You could also use itertools.groupby if lst is already ordered by a and b:
from itertools import groupby
from operator import itemgetter
lst = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 1, "b": 2, "c": 4},
    {"a": 1, "b": 3, "c": 3},
    {"a": 1, "b": 3, "c": 4},
]
result = [
    {"a": a, "b": b, "c": [x["c"] for x in g]}
    for (a, b), g in groupby(lst, key=itemgetter("a", "b"))
]
print(result)
Or, if lst is not ordered by a and b, we can sort by those two keys first:
result = [
    {"a": a, "b": b, "c": [x["c"] for x in g]}
    for (a, b), g in groupby(
        sorted(lst, key=itemgetter("a", "b")), key=itemgetter("a", "b")
    )
]
print(result)
Output:
[{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
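To see why the ordering matters for groupby, here is a small sketch. It starts a new group every time the key changes, so equal keys that are not adjacent end up in separate groups. The names unsorted_lst and group_c are made up for this illustration and are not part of the code above.

```python
from itertools import groupby
from operator import itemgetter

# Same data as before, but deliberately not ordered by "a" and "b".
unsorted_lst = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 1, "b": 3, "c": 3},
    {"a": 1, "b": 2, "c": 4},
    {"a": 1, "b": 3, "c": 4},
]

key = itemgetter("a", "b")


def group_c(items):
    # groupby only merges *consecutive* items that share the same key.
    return [
        {"a": a, "b": b, "c": [x["c"] for x in g]}
        for (a, b), g in groupby(items, key=key)
    ]


without_sort = group_c(unsorted_lst)
with_sort = group_c(sorted(unsorted_lst, key=key))

print(len(without_sort))  # 4 groups: nothing was merged
print(with_sort)
# [{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
```

The defaultdict version does not have this restriction, since it looks groups up by key instead of relying on adjacency.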
For a more generic solution that works with any number of keys:
def merge_lst_dicts(lst, keys, merge_key):
    groups = defaultdict(list)
    for item in lst:
        key = tuple(item.get(k) for k in keys)
        groups[key].append(item.get(merge_key))
    return [
        {**dict(zip(keys, group_key)), merge_key: merged_values}
        for group_key, merged_values in groups.items()
    ]
print(merge_lst_dicts(lst, ["a", "b"], "c"))
# [{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
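If you only know the name of the key to merge, as in the question, the grouping keys can probably be inferred as "every key except that one". Below is a minimal sketch of that idea. merge_on_key is a hypothetical helper name, and it assumes the dictionary keys are sortable (for example all strings) and that the non-merged values are hashable.

```python
from collections import defaultdict


def merge_on_key(lst, merge_key):
    """Group dicts by all of their keys except merge_key (hypothetical helper)."""
    groups = defaultdict(list)
    for item in lst:
        # Sort the remaining (key, value) pairs so that the key order
        # inside each dict does not affect which group it lands in.
        group_key = tuple(
            sorted((k, v) for k, v in item.items() if k != merge_key)
        )
        groups[group_key].append(item[merge_key])
    return [
        {**dict(group_key), merge_key: values}
        for group_key, values in groups.items()
    ]


lst = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 1, "b": 2, "c": 4},
    {"a": 1, "b": 3, "c": 3},
    {"a": 1, "b": 3, "c": 4},
]
merged = merge_on_key(lst, "c")
print(merged)
# [{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
```

Unlike merge_lst_dicts, this raises a KeyError if a dictionary lacks merge_key instead of collecting None, which may or may not be what you want.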