python dictionary set dictview

Inconsistent behaviour between dict.items and dict.values


Note: the code examples use Python 3, but the question applies to Python 2 as well (replace .keys with .viewkeys, etc.)

dict objects provide view methods which (sometimes) support set operations:

>>> {'a': 0, 'b': 1}.keys() & {'a'}
{'a'}
>>> {'a': 0, 'b': 1}.items() & {('a', 0)}
{('a', 0)}

But the values view does not support set operators:

>>> {'a': 0, 'b': 1}.values() & {0}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for &: 'dict_values' and 'set'

I understand that a dict value can be an unhashable object, so it is not always possible to make a set of the values. However, the same is true for dict.items, and there the set operations only fail at runtime, once there is an unhashable value in the dict, whereas the set operation on .values fails immediately.
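A minimal sketch of that difference, assuming CPython 3. The helper try_union is a hypothetical name used only for this illustration. Union is used because it has to hash every pair in the view; other operators may fail at a different point, or not at all, depending on the operands and the interpreter version.

```python
def try_union(view, other):
    """Return view | other, or the TypeError raised while computing it."""
    try:
        return view | other
    except TypeError as exc:
        return exc


hashable = {'a': 0, 'b': 1}
unhashable = {'a': 0, 'b': [1]}  # one value is a list, which cannot be hashed

# items(): works as long as every value is hashable ...
ok = try_union(hashable.items(), {('c', 2)})
# ... and raises only once an unhashable value actually has to be hashed.
items_error = try_union(unhashable.items(), {('c', 2)})
# values(): raises even though every value here is hashable.
values_error = try_union(hashable.values(), {2})

print(ok)
print(type(items_error).__name__, items_error)
print(type(values_error).__name__, values_error)
```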

The docs mention that values views are not treated as set-like since the entries are generally not unique, but this doesn't seem to be a convincing reason: Python doesn't, for example, prevent you from creating a set literal like {0, 0, 1, 2}.

What is the real reason for this inconsistency in behaviour?


Solution

  • If the values view were treated as a set, it would become a very costly object to produce. The hash of every value would have to be calculated before the view could be used as a set; you really don't want to do this for a large dictionary, especially if you don't know up front whether all values are even hashable.

    As such, this is far better left as an explicit operation; if you want to treat the values as a set, explicitly make it a set:

    values = set(yourdict.values())
    

    The dict.items() behaviour stems from the fact that we know up-front that the keys at least are unique, so each (key, value) pair is unique too; under the covers you can delegate membership testing to the keys dictionary view.

    But as soon as you use set operations on the items view (intersection, union, etc.) you are creating a new set object, not a dictionary view. For such a set object both elements of each (key, value) pair must be hashable, as the generic set type cannot make the same assumption about the keys, nor could it maintain that constraint (as {'a': 0}.items() | {('a', 1)} is perfectly legal but leads to duplicate keys).
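The points above can be sketched as follows, assuming CPython 3. The dictionaries and variable names are made up for illustration. The sketch shows that membership testing on an items view works even for an unhashable value, that a set operation returns a plain set in which a key can appear twice, and that converting the values to a set is an explicit step.

```python
# An items view over a dict whose first value is an unhashable list.
d = {'a': [1, 2], 'b': 0}
items = d.items()

# Membership only needs to look up the key and compare the value with ==,
# so it works although the pair ('a', [1, 2]) could never be stored in a set.
found = ('a', [1, 2]) in items
missing = ('a', [9]) in items

# A set operation returns a plain set, not a view; a union with a pair that
# reuses an existing key leaves that key in the result twice.
combined = {'a': 0}.items() | {('a', 1)}
duplicate_keys = sorted(key for key, _ in combined)

# Treating the values as a set is an explicit, one-off conversion.
values = set({'a': 0, 'b': 0, 'c': 1}.values())
common = values & {0, 5}

print(found, missing)
print(type(combined).__name__, duplicate_keys)
print(common)
```

The explicit set(...) call pays the hashing cost once, at a point chosen by the caller, and raises TypeError there if any value turns out to be unhashable.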