pythondictionarysetdictview

Why does dict unioned with dict.keys() return a set?


As I originally expected, the union of a dict and a set gives TypeError:

>>> {1:2} | {3}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for |: 'dict' and 'set'

However, surprisingly, the union of a dict and dict.keys() returns a set:

>>> {1:2} | {3:4}.keys()
{1, 3}

set.union(dict) also has this behavior:

>>> {3}.union({1:2})
{1, 3}

But set | dict does not, and behaves just like dict | set:

>>> {3} | {1:2}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for |: 'set' and 'dict'

What's going on here? Why is taking the union of a dict and set allowed in some cases, but not in others, and why does it return a set of the keys in the cases where it is allowed?


Solution

  • Dictionary views are "set-like", but unlike set, they don't have the (type flexible) named methods (like .union as in your example), they just have operator overloads which remain type-flexible (since the named methods don't exist).

    Being type flexible, they work with any iterable as the other operand, and dicts are iterables of their keys (list({'a': 1, 'b': 2}) is ['a', 'b']), so the values are ignored in the view operation. It's not that dicts are specially accepted here, they're just being treated like any other iterable (you could | a dictionary view with a dict, list, tuple, range, generator, or file-like object, and they'd all work, assuming hashable contents, and produce a set as the result).

    It's not as bad for views to be more flexible because they're not intended to preserve their own type after the operation, they're expected to produce set outputs. set's out-of-place operators are more strict because they don't want to implicitly give precedence to the type of the left or right hand side when determining the type of the output (they don't want set OP nonset to leave any doubt as to whether the result is of type set or nonset, and they don't want to make it possible for nonset OP set to behave differently). Since dictionary views aren't preserving types anyway, they decided to go with a more liberal design for their operator overloads; the result of view OP nonview and nonview OP view is consistent, and it's always a set.

    The documented reason for supporting these specific operations is to match the features of collections.abc.Set:

    For set-like views, all of the operations defined for the abstract base class collections.abc.Set are available (for example, ==, <, or ^).

    and for some reason, collections.abc.Set doesn't require any of the named methods (aside from isdisjoint), only the operator overloads.

    Note: I'm not agreeing with this (I'd have preferred the views only have their operators work with other views and with set/frozenset, and that the views have the named methods as well for consistency), but it's too late to change, so it is how it is.

    As for the set methods being type flexible, that was more of an intentional choice. Operators don't give a strong impression that one of the operands is more important, while methods necessarily do (the thing you called it on is obviously more important than the arguments). So the methods accept arbitrary iterables (and in fact, can accept more than one iterable, as in {1, 2, 3}.union(range(5), range(10, 15))) and return the type of the object they were called on, while the operators insist that the types agree to avoid surprises.