pythondictionary

Recursive function to check dictionary is a subset of another dictionary


I'd like to check whether a dictionary is a subset of another dictionary recursively. Let's assume both dictionaries having builtin types as items.

I've seen there is already a very old thread Python: Check if one dictionary is a subset of another larger dictionary trying to solve something similar but not quite... Because none of the answers living there will serve for my purposes I've decided to post my own solution in there but it is still not fully complete, the below function is working ok for almost all cases but it fails miserably in cases when the subset has values that don't exist in the superset, ie:

def is_subset(superset, subset):
    for key, value in superset.items():
        if key not in subset:
            continue

        if isinstance(value, dict):
            if not is_subset(value, subset[key]):
                return False
        elif isinstance(value, str):
            if not subset[key] in value:
                return False
        elif isinstance(value, list):
            if not set(subset[key]) <= set(value):
                return False
        elif isinstance(value, set):
            if not subset[key] <= value:
                return False
        else:
            if not subset[key] == value:
                return False

    return True

superset = {'question': 'mcve', 'metadata': {}}
subset = {'question': 'mcve', 'metadata': {'author': 'BPL'}}

print(is_subset(superset, subset))

The function returns True but it should return False. So, how would you fix it?


Solution

  • Your code logic is upside down. Notice how you take each element in the superset and continue if they are not in the subset. What you want to do is take each element in the subset and check that they are in the superset.

    Here is a fixed version of you code.

    def is_subset(superset, subset):
        for key, value in subset.items():
            if key not in superset:
                return False
    
            if isinstance(value, dict):
                if not is_subset(superset[key], value):
                    return False
    
            elif isinstance(value, str):
                if value not in superset[key]:
                    return False
    
            elif isinstance(value, list):
                if not set(value) <= set(superset[key]):
                    return False
            elif isinstance(value, set):
                if not value <= superset[key]:
                    return False
    
            else:
                if not value == superset[key]:
                    return False
    
        return True
    

    Here are some example of the function giving the correct result.

    superset = {'question': 'mcve', 'metadata': {}}
    subset = {'question': 'mcve', 'metadata': {'author': 'BPL'}}
    
    is_subset(superset, subset) # False
    
    superset = {'question': 'mcve', 'metadata': {'foo': {'bar': 'baz'}}}
    subset = {'metadata': {'foo': {}}}
    
    is_subset(superset, subset) # True
    
    superset = {'question': 'mcve', 'metadata': {'foo': 'bar'}}
    subset = {'question': 'mcve', 'metadata': {}, 'baz': 'spam'}
    
    is_subset(superset, subset) # False