
List construction in Python bytecode


This Python code:

import dis
def f():
    a=[1,2,3]

dis.dis(f)

generates this output:

  2           0 RESUME                   0

  3           2 BUILD_LIST               0
              4 LOAD_CONST               1 ((1, 2, 3))
              6 LIST_EXTEND              1
              8 STORE_FAST               0 (a)
             10 LOAD_CONST               0 (None)
             12 RETURN_VALUE
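
The numeric arguments in this listing are indexes into tables stored on the code object. As a minimal sketch, assuming CPython 3.9 or later (the versions that emit this BUILD_LIST / LIST_EXTEND form), the same information can be read programmatically. Exact offsets and constant indexes vary between versions, so no output is shown.

```python
import dis

def f():
    a = [1, 2, 3]

code = f.__code__

# Walk the instructions: each has an opcode name and an integer argument (oparg).
opnames = []
for ins in dis.get_instructions(f):
    opnames.append(ins.opname)
    print(ins.offset, ins.opname, ins.arg, ins.argrepr)

# LOAD_CONST's oparg indexes into co_consts; STORE_FAST's indexes into co_varnames.
print(code.co_consts)    # contains the tuple (1, 2, 3)
print(code.co_varnames)  # contains the local name 'a'
```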

I am confused about the list-building process.

My guess is that:

Is this correct? (edit: no, the guess is incorrect)


Solution

  • CPython executes its bytecode on a stack-based virtual machine. Let's go through the dis output line by line.

    The first instruction is BUILD_LIST 0. Here BUILD_LIST is the opcode and 0 is the oparg. It creates an empty list and pushes it onto the top of the stack.

    case TARGET(BUILD_LIST): {
        PyObject *list =  PyList_New(oparg); // create empty list
        if (list == NULL)
            goto error;
        // Skipped other parts of the code for brevity.
        PUSH(list); // push it onto the top of the stack
        DISPATCH();
    }
    

    After this operation, the stack state is:

    stack
    list (a pointer to a PyObject)

    The next instruction, LOAD_CONST 1, pushes (1, 2, 3) (i.e., co_consts[1]) onto the top of the stack.

    case TARGET(LOAD_CONST): {
         PREDICTED(LOAD_CONST);
         PyObject *value = GETITEM(consts, oparg);
         Py_INCREF(value);
         PUSH(value);
         DISPATCH();
    }
    

    The stack state is now:

    stack
    (1, 2, 3)
    list (a pointer to a PyObject)

    LIST_EXTEND 1 first pops the iterable from the top of the stack (the tuple (1, 2, 3)). PEEK(oparg) then reads the element oparg positions from the top of the stack without removing it; with oparg equal to 1 that is the new top, which is the list object.

    PEEK(n) is a macro defined as #define PEEK(n) (stack_pointer[-(n)])

    case TARGET(LIST_EXTEND): {
         PyObject *iterable = POP();
         PyObject *list = PEEK(oparg);
         PyObject *none_val = _PyList_Extend((PyListObject *)list, iterable); 
         // Skipped other parts of the code for brevity.
         Py_DECREF(none_val);
         Py_DECREF(iterable);
         DISPATCH();
    }
    

    The internal _PyList_Extend function extends the list in place with the items of the iterable.

    The stack state is now:

    stack
    list (now populated with 1, 2, 3)

    Finally, STORE_FAST 0 pops the top of the stack and stores it in the local variable a (co_varnames[var_num], with var_num = 0 here).
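
The four steps above can be replayed with a toy stack machine. This is only an illustrative sketch in Python, not CPython's implementation (that is the C code quoted above); the function name run and the trace format are made up for this example, and the constants table is assumed to be (None, (1, 2, 3)) to match the listing in the question.

```python
def run(instructions, consts, varnames):
    """Toy interpreter for the four opcodes discussed above (hypothetical helper)."""
    stack = []        # value stack; the end of the Python list is the top
    local_vars = {}   # stands in for the frame's fast locals
    trace = []        # (opname, oparg, stack snapshot) after each instruction
    for opname, oparg in instructions:
        if opname == "BUILD_LIST":
            # Pop oparg items (none when oparg == 0) and push a new list.
            items = [stack.pop() for _ in range(oparg)][::-1]
            stack.append(items)
        elif opname == "LOAD_CONST":
            stack.append(consts[oparg])
        elif opname == "LIST_EXTEND":
            iterable = stack.pop()
            # Same idea as PEEK(oparg): index from the top without popping.
            stack[-oparg].extend(iterable)
        elif opname == "STORE_FAST":
            local_vars[varnames[oparg]] = stack.pop()
        else:
            raise ValueError(f"unsupported opcode in this sketch: {opname}")
        trace.append((opname, oparg, repr(stack)))
    return local_vars, trace


program = [
    ("BUILD_LIST", 0),
    ("LOAD_CONST", 1),
    ("LIST_EXTEND", 1),
    ("STORE_FAST", 0),
]
result, trace = run(program, consts=(None, (1, 2, 3)), varnames=("a",))

for opname, oparg, snapshot in trace:
    print(f"{opname:<12} {oparg}  stack={snapshot}")
print(result)
```

The snapshots follow the stack diagrams in the answer: an empty list, then the tuple on top of it, then the populated list alone, and finally an empty stack once a has been stored.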