python cpython

Why does `dict(id=1, **{'id': 2})` sometimes raise `KeyError: 'id'` instead of a TypeError?


Normally, if you try to pass multiple values for the same keyword argument, you get a TypeError:

In [1]: dict(id=1, **{'id': 2})
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [1], in <cell line: 1>()
----> 1 dict(id=1, **{'id': 2})

TypeError: dict() got multiple values for keyword argument 'id'

But if you do it while handling another exception, you get a KeyError instead:

In [2]: try:
   ...:     raise ValueError('foo') # no matter what kind of exception
   ...: except:
   ...:     dict(id=1, **{'id': 2}) # raises: KeyError: 'id'
   ...: 
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [2], in <cell line: 1>()
      1 try:
----> 2     raise ValueError('foo') # no matter what kind of exception
      3 except:

ValueError: foo

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
Input In [2], in <cell line: 1>()
      2     raise ValueError('foo') # no matter what kind of exception
      3 except:
----> 4     dict(id=1, **{'id': 2})

KeyError: 'id'

What's going on here? How could a completely unrelated exception affect which exception dict(id=1, **{'id': 2}) raises?

For context, I discovered this behavior while investigating the following bug report: https://github.com/tortoise/tortoise-orm/issues/1583

This has been reproduced on CPython 3.11.8, 3.10.5, and 3.9.5.


Solution

  • This looks like a Python bug.

    The code that's supposed to raise the TypeError works by detecting an initial KeyError and replacing it, but the detection is fragile. When the call happens while another exception is being handled, that code fails to recognize the KeyError and lets it through instead of replacing it with a TypeError.

    The bug appears to be gone on 3.12, due to changes in the exception implementation.
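To check which behavior a particular interpreter shows, a small script along these lines may help. It is only a sketch: the helper name `call_with_duplicate_kwarg` is made up for illustration, and the script simply reports whatever exception type it observes rather than assuming one.

```python
import sys


def call_with_duplicate_kwarg():
    """Hypothetical helper: return the exception raised by the duplicate-keyword call."""
    try:
        dict(id=1, **{'id': 2})
    except Exception as exc:
        return exc
    return None


# Case 1: no other exception is being handled.
outside = call_with_duplicate_kwarg()

# Case 2: the same call, made while a ValueError is being handled.
try:
    raise ValueError('foo')
except ValueError:
    inside = call_with_duplicate_kwarg()

print(sys.version_info[:3])
print('outside handler:', type(outside).__name__)
print('inside handler: ', type(inside).__name__)
```

On an affected version the second line should report KeyError; on a version without the bug both lines should report TypeError.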


    Here's the deep dive, based on the CPython 3.11.8 source code. Similar code exists on 3.10 and 3.9.

    We can use the dis module to examine the bytecode for dict(id=1, **{'id': 2}):

    In [1]: import dis
    
    In [2]: dis.dis("dict(id=1, **{'id': 2})")
      1           0 LOAD_NAME                0 (dict)
                  2 LOAD_CONST               3 (())
                  4 LOAD_CONST               0 ('id')
                  6 LOAD_CONST               1 (1)
                  8 BUILD_MAP                1
                 10 LOAD_CONST               0 ('id')
                 12 LOAD_CONST               2 (2)
                 14 BUILD_MAP                1
                 16 DICT_MERGE               1
                 18 CALL_FUNCTION_EX         1
                 20 RETURN_VALUE
    

    Python uses the DICT_MERGE opcode to merge two dicts, to build the final keyword argument dict.
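The same check can be done programmatically instead of reading the disassembly by eye. This is a minimal sketch using `dis.get_instructions`; the exact opcodes differ between CPython versions, so the result is printed rather than assumed.

```python
import dis

# Collect the opcode names the compiler emits for the call expression.
opnames = [ins.opname for ins in dis.get_instructions("dict(id=1, **{'id': 2})")]

# Reports whether this interpreter builds the keyword dict with DICT_MERGE.
# Other versions may compile the call differently.
print('DICT_MERGE' in opnames)
```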

    The relevant part of the DICT_MERGE code is as follows:

                if (_PyDict_MergeEx(dict, update, 2) < 0) {
                    format_kwargs_error(tstate, PEEK(2 + oparg), update);
                    Py_DECREF(update);
                    goto error;
                }
    

    It uses _PyDict_MergeEx to attempt to merge two dicts, and if that fails (and raises an exception), it uses format_kwargs_error to try to raise a different exception.

    When the third argument to _PyDict_MergeEx is 2, that function will raise a KeyError for duplicate keys, inside the dict_merge helper function. This is where the KeyError comes from.
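As a rough illustration of that mode, here is a toy pure-Python model of a merge that rejects duplicate keys. It is not CPython's implementation, and the function name `merge_rejecting_duplicates` is hypothetical; it only mirrors the behavior described above.

```python
def merge_rejecting_duplicates(target, update):
    """Toy model (not CPython code): merge update into target,
    raising KeyError as soon as a key is already present."""
    for key, value in update.items():
        if key in target:
            raise KeyError(key)
        target[key] = value


kwargs = {'id': 1}
try:
    merge_rejecting_duplicates(kwargs, {'id': 2})
except KeyError as exc:
    # The KeyError carries the offending key, which is what the
    # later error-formatting step needs for its message.
    duplicate = exc.args[0]
    print('duplicate key:', duplicate)
```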

    Once the KeyError is raised, format_kwargs_error has the job of replacing it with a TypeError. It tries to do so with the following code:

        else if (_PyErr_ExceptionMatches(tstate, PyExc_KeyError)) {
            PyObject *exc, *val, *tb;
            _PyErr_Fetch(tstate, &exc, &val, &tb);
            if (val && PyTuple_Check(val) && PyTuple_GET_SIZE(val) == 1) {
    

    However, this code is looking for an unnormalized exception, an internal representation of exceptions that isn't exposed to Python-level code. It expects the exception value to be a 1-element tuple containing the key that the KeyError was raised for, instead of an actual exception object.

    Exceptions raised inside C code are usually unnormalized, but not if they occur while Python is handling another exception. Unnormalized exceptions cannot handle exception chaining, which occurs automatically for exceptions raised inside an exception handler. In this case, the internal _PyErr_SetObject routine will automatically normalize the exception:

        exc_value = _PyErr_GetTopmostException(tstate)->exc_value;
        if (exc_value != NULL && exc_value != Py_None) {
            /* Implicit exception chaining */
            Py_INCREF(exc_value);
            if (value == NULL || !PyExceptionInstance_Check(value)) {
                /* We must normalize the value right now */
    

    Since the KeyError has been normalized, format_kwargs_error doesn't understand what it's looking at. It lets the KeyError through, instead of raising the TypeError it's supposed to.
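The mismatch can be modeled in plain Python. The sketch below is not the C code; `old_check_recognizes` is a hypothetical stand-in for the tuple test in the 3.11 format_kwargs_error, applied to the two shapes the exception value can take.

```python
def old_check_recognizes(val):
    """Toy model of the 3.11 test: is the value a 1-element tuple?"""
    return val is not None and isinstance(val, tuple) and len(val) == 1


# Unnormalized: the "value" is just the would-be args tuple.
unnormalized_value = ('id',)
# Normalized: the value is already a real exception instance.
normalized_value = KeyError('id')

print(old_check_recognizes(unnormalized_value))  # True
print(old_check_recognizes(normalized_value))    # False
```

In the second case the check fails, so in this model the KeyError would be left in place, matching the behavior seen inside an exception handler.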


    On Python 3.12, things are different. The internal exception representation has been changed, so any raised exception is always normalized. Thus, the Python 3.12 version of format_kwargs_error looks for a normalized exception instead of an unnormalized exception, and if _PyDict_MergeEx has raised a KeyError, the code will recognize it:

        else if (_PyErr_ExceptionMatches(tstate, PyExc_KeyError)) {
            PyObject *exc = _PyErr_GetRaisedException(tstate);
            PyObject *args = ((PyBaseExceptionObject *)exc)->args;
            if (exc && PyTuple_Check(args) && PyTuple_GET_SIZE(args) == 1) {
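In Python terms, the 3.12 check inspects the args attribute of an exception instance. A minimal sketch of that idea follows; `new_check_recognizes` is a hypothetical name, and this is a model of the condition, not the C implementation.

```python
def new_check_recognizes(exc):
    """Toy model of the 3.12 test: does the exception carry exactly one argument?"""
    args = exc.args
    return isinstance(args, tuple) and len(args) == 1


err = KeyError('id')
print(new_check_recognizes(err))  # True
print(err.args[0])                # id
```

Because the check works on a normalized exception, it no longer matters whether the KeyError was raised inside another handler.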