I have two lists of dictionaries (LODs). I want to compare them and merge the items that match on the keys 'id' and 'size'. If an item in the first LOD has no match in the second, it should get the key 'count' with a default value of 0.
lod1 = [{'id':1, 'size':1, 'colour':'a'}, {'id':1, 'size':2, 'colour':'ab'}, {'id':2, 'size':1, 'colour':'ab'}, {'id':2, 'size':2, 'colour':'ab'}]
lod2 = [{'id':1, 'size':1, 'count':1}, {'id':1, 'size':2, 'count':2}, {'id':2, 'size':1, 'count':3}]
Expected output:
merged = [{'id':1, 'size':1, 'colour':'a', 'count': 1}, {'id':1, 'size':2, 'colour':'ab', 'count': 2}, {'id':2, 'size':1, 'colour':'ab', 'count':3}, {'id':2, 'size':2, 'colour':'ab', 'count': 0}]
Note the last item, {'id':2, 'size':2, 'colour':'ab', 'count': 0}: it has no match in lod2, so it gets the default count.
There are many similar questions and solutions about comparing and merging LODs, but none covers exactly this case. I know one way to do it, but it looks bulky, clumsy and inelegant.
My solution:
def merge(l1, l2):
    lod1_pairs = [(i['id'], i['size']) for i in l1]
    lod2_pairs = [(i['id'], i['size']) for i in l2]
    # (id, size) pairs from l1 with no match in l2; these get the default 'count'
    mismatched = [pair for pair in lod1_pairs if pair not in lod2_pairs]
    merged = []
    for i in l1:
        if (i['id'], i['size']) in mismatched:
            # no match in l2: add the default 'count' right away
            temp_dict = i | {'count': 0}
            merged.append(temp_dict)
        else:
            for j in l2:
                # compare the key values directly
                if i['id'] == j['id'] and i['size'] == j['size']:
                    temp_dict = i | j  # merge dicts (Python 3.9+)
                    merged.append(temp_dict)
    return merged
lod1 = [{'id':1, 'size':1, 'colour':'a'}, {'id':1 , 'size':2, 'colour':'ab'}, {'id':2, 'size':1, 'colour':'ab'}, {'id':2, 'size':2, 'colour':'ab'}]
lod2 = [{'id':1, 'size':1, 'count':1}, {'id':1, 'size':2, 'count':2}, {'id':2, 'size':1, 'count':3}]
merged = merge(lod1,lod2)
I have some experience with defaultdict and I looked at the DeepDiff library, but I couldn't see how to use either of them here.
You can try:
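# Start from lod1, keyed by (id, size), with a default count of 0.
out = {(d["id"], d["size"]): {**d, "count": 0} for d in lod1}
for d in lod2:
    t = d["id"], d["size"]
    if t in out:
        # Matching pair: take the count from lod2.
        out[t]["count"] = d["count"]
    else:
        # Pair exists only in lod2: keep that dict as it is.
        out[t] = d
print(list(out.values()))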
Prints:
[
{"id": 1, "size": 1, "colour": "a", "count": 1},
{"id": 1, "size": 2, "colour": "ab", "count": 2},
{"id": 2, "size": 1, "colour": "ab", "count": 3},
{"id": 2, "size": 2, "colour": "ab", "count": 0},
]
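If items that exist only in lod2 should be left out (the expected output in the question contains only items from lod1), a shorter variant might look like the sketch below. It indexes lod2 by its (id, size) pair and uses dict.get to fall back to the default. The function name merge_with_default and its default parameter are made up for this example, and the sketch assumes every dict in lod2 has a 'count' key.

```python
def merge_with_default(l1, l2, default=0):
    # Index the counts from l2 by their (id, size) pair for fast lookups.
    counts = {(d["id"], d["size"]): d["count"] for d in l2}
    # Copy each dict from l1 and attach the matching count, or the default.
    return [
        {**d, "count": counts.get((d["id"], d["size"]), default)}
        for d in l1
    ]


lod1 = [
    {"id": 1, "size": 1, "colour": "a"},
    {"id": 1, "size": 2, "colour": "ab"},
    {"id": 2, "size": 1, "colour": "ab"},
    {"id": 2, "size": 2, "colour": "ab"},
]
lod2 = [
    {"id": 1, "size": 1, "count": 1},
    {"id": 1, "size": 2, "count": 2},
    {"id": 2, "size": 1, "count": 3},
]

merged = merge_with_default(lod1, lod2)
print(merged)
```

Unlike the loop above, this version copies only 'count' from lod2 and never modifies the input dicts, so it is not a drop-in replacement if lod2 carries other keys you want to keep.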