python dictionary nested-lists defaultdict deep-diff

Python: compare and merge diverse lists of dictionaries by matching items


I have two lists of dictionaries (lod). I want to compare them and merge the entries that match on the keys 'id' and 'size'. When an entry in the first lod has no match in the second, it should get a default value (0) under the key 'count'.

lod1 = [{'id':1, 'size':1, 'colour':'a'}, {'id':1, 'size':2, 'colour':'ab'}, {'id':2, 'size':1, 'colour':'ab'}, {'id':2, 'size':2, 'colour':'ab'}]
lod2 = [{'id':1, 'size':1, 'count':1}, {'id':1, 'size':2, 'count':2}, {'id':2, 'size':1, 'count':3}]

Output:

merged = [{'id':1, 'size':1, 'colour':'a', 'count': 1}, {'id':1, 'size':2, 'colour':'ab', 'count': 2},  {'id':2, 'size':1, 'colour':'ab', 'count':3}, {'id':2, 'size':2, 'colour':'ab', 'count': 0}]

Note the last item, {'id':2, 'size':2, 'colour':'ab', 'count': 0}: it has no match in lod2, so it gets the default 'count' of 0.

There are many similar questions and answers about comparing and merging lods, but none covers exactly this case. I know one way to do it, but it looks bulky, clumsy and inelegant.

My solution:

def merge(l1, l2):
    lod1_pairs = [(i['id'], i['size']) for i in l1]
    lod2_pairs = [(i['id'], i['size']) for i in l2]
    # (id, size) pairs from l1 that have no match in l2; these get the default 'count'
    mismatched = [pair for pair in lod1_pairs if pair not in lod2_pairs]

    merged = []
    for i in l1:
        if (i['id'], i['size']) in mismatched:
            # no match in l2: add the default 'count' right away
            temp_dict = i | {'count': 0}
            merged.append(temp_dict)
        else:
            for j in l2:
                # compare the key values directly
                if i['id'] == j['id'] and i['size'] == j['size']:
                    temp_dict = i | j  # merge dicts (the | operator needs Python 3.9+)
                    merged.append(temp_dict)
    return merged


lod1 = [{'id':1, 'size':1, 'colour':'a'}, {'id':1 , 'size':2, 'colour':'ab'}, {'id':2, 'size':1, 'colour':'ab'}, {'id':2, 'size':2, 'colour':'ab'}]
lod2 = [{'id':1, 'size':1, 'count':1}, {'id':1, 'size':2, 'count':2}, {'id':2, 'size':1, 'count':3}]
merged = merge(lod1,lod2)

I have some experience with defaultdict and have looked at the DeepDiff library, but I don't understand how I could use them here.


Solution

  • You can try:

    out = {(d["id"], d["size"]): {**d, "count": 0} for d in lod1}
    for d in lod2:
        t = d["id"], d["size"]
        if t in out:
            out[t]["count"] = d["count"]
        else:
            out[t] = d
    
    print(list(out.values()))
    

    Prints:

    [
        {"id": 1, "size": 1, "colour": "a", "count": 1},
        {"id": 1, "size": 2, "colour": "ab", "count": 2},
        {"id": 2, "size": 1, "colour": "ab", "count": 3},
        {"id": 2, "size": 2, "colour": "ab", "count": 0},
    ]
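
    One difference worth noting: the snippet above also adds entries that exist only in lod2, whereas the merge function in the question keeps only the entries of lod1. If that behaviour matters, a lookup-based variant could look like the sketch below. The function name merge_with_default and its parameters (keys, field, default) are hypothetical, chosen only for illustration, and the sketch assumes each (id, size) pair occurs at most once in the second list (if it repeats, the last occurrence wins).

```python
def merge_with_default(primary, secondary, keys=("id", "size"), field="count", default=0):
    """Merge dicts from `secondary` into matching dicts from `primary`.

    Hypothetical helper: only entries of `primary` appear in the result,
    and entries without a match get `field` set to `default`.
    """
    # Index the secondary list by its key tuple so each lookup is O(1).
    lookup = {tuple(d[k] for k in keys): d for d in secondary}

    merged = []
    for item in primary:
        match = lookup.get(tuple(item[k] for k in keys))
        # Copy the item so the input lists are not mutated.
        row = {**item, **(match or {})}
        # Fill in the default only when no value was supplied.
        row.setdefault(field, default)
        merged.append(row)
    return merged


lod1 = [{'id': 1, 'size': 1, 'colour': 'a'}, {'id': 1, 'size': 2, 'colour': 'ab'},
        {'id': 2, 'size': 1, 'colour': 'ab'}, {'id': 2, 'size': 2, 'colour': 'ab'}]
lod2 = [{'id': 1, 'size': 1, 'count': 1}, {'id': 1, 'size': 2, 'count': 2},
        {'id': 2, 'size': 1, 'count': 3}]

merged = merge_with_default(lod1, lod2)
print(merged)
```

    Building the lookup once replaces the nested loop over both lists, so the work grows with len(lod1) + len(lod2) rather than with their product.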