I'd like to check whether a dictionary is a subset of another dictionary recursively. Let's assume both dictionaries having builtin types as items.
I've seen there is already a very old thread Python: Check if one dictionary is a subset of another larger dictionary trying to solve something similar but not quite... Because none of the answers living there will serve for my purposes I've decided to post my own solution in there but it is still not fully complete, the below function is working ok for almost all cases but it fails miserably in cases when the subset has values that don't exist in the superset, ie:
def is_subset(superset, subset):
for key, value in superset.items():
if key not in subset:
continue
if isinstance(value, dict):
if not is_subset(value, subset[key]):
return False
elif isinstance(value, str):
if not subset[key] in value:
return False
elif isinstance(value, list):
if not set(subset[key]) <= set(value):
return False
elif isinstance(value, set):
if not subset[key] <= value:
return False
else:
if not subset[key] == value:
return False
return True
superset = {'question': 'mcve', 'metadata': {}}
subset = {'question': 'mcve', 'metadata': {'author': 'BPL'}}
print(is_subset(superset, subset))
The function returns True but it should return False. So, how would you fix it?
Your code logic is upside down. Notice how you take each element in the superset and continue if they are not in the subset. What you want to do is take each element in the subset and check that they are in the superset.
Here is a fixed version of you code.
def is_subset(superset, subset):
for key, value in subset.items():
if key not in superset:
return False
if isinstance(value, dict):
if not is_subset(superset[key], value):
return False
elif isinstance(value, str):
if value not in superset[key]:
return False
elif isinstance(value, list):
if not set(value) <= set(superset[key]):
return False
elif isinstance(value, set):
if not value <= superset[key]:
return False
else:
if not value == superset[key]:
return False
return True
Here are some example of the function giving the correct result.
superset = {'question': 'mcve', 'metadata': {}}
subset = {'question': 'mcve', 'metadata': {'author': 'BPL'}}
is_subset(superset, subset) # False
superset = {'question': 'mcve', 'metadata': {'foo': {'bar': 'baz'}}}
subset = {'metadata': {'foo': {}}}
is_subset(superset, subset) # True
superset = {'question': 'mcve', 'metadata': {'foo': 'bar'}}
subset = {'question': 'mcve', 'metadata': {}, 'baz': 'spam'}
is_subset(superset, subset) # False