python iterable-unpacking

Why does Python tuple unpacking work on sets?


Sets don't have a deterministic order in Python. Why then can you do tuple unpacking on a set in Python?

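To demonstrate the problem, take the following in CPython 3.10.12 (the exact results can differ between runs, because string hashing is randomized per process):

a, b = {"foo", "bar"}  # in one run this set `a = "bar"`, `b = "foo"`
a, b = {"foo", "baz"}  # in one run this set `a = "foo"`, `b = "baz"`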

I recognize that the literal answer is that Python tuple unpacking works on any iterable. For example, you can do the following:

def f():
    yield 1
    yield 2

a, b = f()
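
As a further sketch of the same point: unpacking appears to need nothing more than the iterator protocol, so any object with an `__iter__` method qualifies. The `Pair` class below is hypothetical and exists only for this illustration.

```python
class Pair:
    """Hypothetical two-element container, used only for illustration."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def __iter__(self):
        # Unpacking only calls this method; it never asks about ordering.
        yield self.left
        yield self.right


a, b = Pair(1, 2)

# A set goes through the same protocol: the targets receive the elements
# in whatever order iterating the set happens to produce.
s = {"foo", "bar"}
c, d = s
it = iter(s)
same_as_iteration = (c, d) == (next(it), next(it))
```

Nothing in this path distinguishes an ordered iterable from an unordered one, which is why a set is accepted like any other.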

But why doesn't tuple unpacking check that the thing being unpacked has a deterministic order?


Solution

  • The core "why" is that all features start at -100 points, and nobody thought it was worth preventing sets from being used in this context.

    Every new feature costs developer resources to write it, write tests for it, code review it, and then maintain it forever. There has to be a significant benefit to the feature to justify it. "Preventing people from doing something that is potentially useful in niche contexts to avoid (possibly accidental) misuse in other niche contexts" is essentially neutral on pros and cons.

    You could propose a change that forbids this. If someone came up with a benefit large enough to cancel out not only the -100 points all features start at, but also the extra negative points for definitely breaking code in use right now, then the core developers might deprecate (with a warning) iterable unpacking of sets and other unordered iterables, and in a year or three some new version of Python could forbid it. I don't see it happening.

    Fundamentally:

    1. It is useful for sets to be iterable. Even if you consider unordered iteration bad, sorted(someset) relies on being able to iterate the set to produce the list that it then sorts. So that's not going away.
    2. Iterable unpacking applies to all iterables; you'd need to special-case unordered iterables to explicitly block it.
    3. You'll never prevent all forms of the misuse you seem to dislike; something as simple as a, b = list(theset) would keep the "misuse" from being detected.
    4. There are always going to be valid use cases this needlessly blocks. One is checking for a single-element set by unpacking, [obj] = theset (with try/except used to handle the case where it doesn't hold exactly one element). Another is processing elements destructively, when you just need one arbitrary element at a time, e.g. first, *collection = collection (where collection starts as a set but becomes a list as a side effect).
    5. Even if they put in an explicit means of detecting unordered iterables, e.g. an __unordered__ attribute on the class with C level support so it can be checked efficiently, that's still slowing down a highly optimized code path for little to no benefit.
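
The patterns from points 3 and 4 can be sketched as follows. This is a minimal illustration, not code from any real project; the `only_element` helper and the sample values are hypothetical.

```python
theset = {"foo", "bar"}

# Point 3: converting to a list first hides the set from any would-be check,
# yet the order of a and b is just as arbitrary as before.
a, b = list(theset)


# Point 4: unpacking into a one-element target asserts "exactly one element".
def only_element(items):
    """Return the sole element of items, or None if there is not exactly one.

    Hypothetical helper, for illustration only.
    """
    try:
        [obj] = items
    except ValueError:  # raised for too few and for too many elements
        return None
    return obj


# Point 4: consume one arbitrary element at a time.
collection = {"x", "y", "z"}
seen = []
while collection:
    # After the first pass, collection is a list rather than a set.
    first, *collection = collection
    seen.append(first)
```

In each case the caller does not care which element comes first, so the arbitrary order is harmless.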

    So it's a feature that would never catch all cases of "misuse", would slow down every unpacking in order to prevent it, would break existing code, and is only an arguable benefit in the first place. They haven't done it, and almost certainly never will.