python python-3.x multiple-assignment

Python3 multiple assignment and memory address


After reading this and this, which are pretty similar to my question, I still cannot understand the following behaviour:

a = 257
b = 257
print(a is b)  # False
a, b = 257, 257
print(a is b)  # True

Printing id(a) and id(b) shows that variables assigned on separate lines have different ids, whereas with multiple assignment both variables have the same id:

a = 257
b = 257
print(id(a))  # 139828809414512
print(id(b))  # 139828809414224
a, b = 257, 257
print(id(a))  # 139828809414416
print(id(b))  # 139828809414416

But this behaviour cannot be explained by saying that multiple assignment of equal values always binds both names to the same object, since:

a, b = -1000, -1000
print(id(a))  # 139828809414448
print(id(b))  # 139828809414288

Is there a clear rule that explains when the variables get the same id and when they do not?

edit

Relevant info: the code in this question was run in interactive mode (ipython3).


Solution

  • This is due to the bytecode compiler merging equal constants (loosely called constant folding in this answer; strictly, that term refers to evaluating constant expressions at compile time). When the bytecode compiler compiles a batch of statements, it uses a dict to keep track of the constants it has seen. This dict automatically merges any equivalent constants.

    Here's the routine responsible for recording and numbering constants (as well as a few related responsibilities), as it appears in a Python 2 version of CPython's compiler (hence the PyInt_* calls):

    static int
    compiler_add_o(struct compiler *c, PyObject *dict, PyObject *o)
    {
        PyObject *t, *v;
        Py_ssize_t arg;
    
        t = _PyCode_ConstantKey(o);
        if (t == NULL)
            return -1;
    
        v = PyDict_GetItem(dict, t);
        if (!v) {
            arg = PyDict_Size(dict);
            v = PyInt_FromLong(arg);
            if (!v) {
                Py_DECREF(t);
                return -1;
            }
            if (PyDict_SetItem(dict, t, v) < 0) {
                Py_DECREF(t);
                Py_DECREF(v);
                return -1;
            }
            Py_DECREF(v);
        }
        else
            arg = PyInt_AsLong(v);
        Py_DECREF(t);
        return arg;
    }
    

    You can see that it only adds a new entry and assigns a new number if it doesn't find an equivalent constant already present. (The _PyCode_ConstantKey bit makes sure things like 0.0, -0.0, and 0 are considered inequivalent.)
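
The merging described above can be observed from Python itself. The following is a minimal sketch, assuming CPython; the helper name `constants_of` and the `<demo>` filename are made up for illustration. It compiles two small batches and inspects the `co_consts` tuple of the resulting code object.

```python
def constants_of(source):
    """Compile one batch of statements and return its constants tuple."""
    code = compile(source, "<demo>", "exec")
    return code.co_consts


# Two equal int literals in one batch end up as a single constant entry.
merged = constants_of("a = 257\nb = 257\n")

# 257 == 257.0, but the constant key also takes the type into account,
# so the int and the float are recorded as separate constants.
mixed = constants_of("a = 257\nb = 257.0\n")

merged_count = merged.count(257)
mixed_types = sorted(type(c).__name__ for c in mixed if c == 257)

print(merged_count)
print(mixed_types)
```

Because both names in the first batch load the same entry of `co_consts`, they are bound to the same object, which is why `a is b` can come out True.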

    In interactive mode, a batch ends every time the interpreter has to actually run your command, so constant folding mostly doesn't happen across commands:

    >>> a = 1000
    >>> b = 1000
    >>> a is b
    False
    >>> a = 1000; b = 1000 # 1 batch
    >>> a is b
    True
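
The batch boundary can be imitated outside the interactive prompt. This is a rough sketch, assuming CPython: each `compile()` call stands in for one interactive command, and the `<command N>` filenames and variable names are illustrative only.

```python
namespace = {}

# Two separate compile() calls stand in for two separate interactive commands.
exec(compile("a = 1000", "<command 1>", "single"), namespace)
exec(compile("b = 1000", "<command 2>", "single"), namespace)
across_batches = namespace["a"] is namespace["b"]

# One compile() call stands in for a single command holding both statements.
exec(compile("a = 1000; b = 1000", "<command 3>", "single"), namespace)
within_batch = namespace["a"] is namespace["b"]

# The first result is typically False on CPython, but it is an
# implementation detail and is not guaranteed.
print("separate batches share the object:", across_batches)
print("one batch shares the object:", within_batch)
```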
    

    In a script, all top-level statements are one batch, so more constant folding happens:

    a = 257
    b = 257
    print(a is b)
    

    In a script, this prints True.

    A function's code gets its constants tracked separately from code outside the function, which limits constant folding:

    a = 257
    
    def f():
        b = 257
        print(a is b)
    
    f()
    

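    Even in a script, this prints False on the CPython version described here. Since this is an implementation detail, other versions may behave differently.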
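
The separate bookkeeping for a function can be made visible by looking at the nested code object. This is a sketch, assuming CPython; the source string and variable names are illustrative. Whether the two 257 objects turn out to be identical depends on the interpreter version, so the last line prints the result rather than assuming it.

```python
import types

source = "a = 257\n\ndef f():\n    b = 257\n    return b\n"
module_code = compile(source, "<demo>", "exec")

# The function body is compiled to its own code object, which is stored
# among the module's constants.
func_code = next(
    c for c in module_code.co_consts if isinstance(c, types.CodeType)
)

# Each code object carries its own constants tuple containing 257.
in_module = 257 in module_code.co_consts
in_function = 257 in func_code.co_consts
print(in_module, in_function)

# Whether both tuples refer to the very same int object is version-dependent.
module_257 = next(c for c in module_code.co_consts if c == 257)
func_257 = next(c for c in func_code.co_consts if c == 257)
print("shared object:", module_257 is func_257)
```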