After reading this and this, which are pretty similar to my question, I still cannot understand the following behaviour:
a = 257
b = 257
print(a is b) #False
a, b = 257, 257
print(a is b) #True
When printing id(a) and id(b), I can see that the variables assigned on separate lines have different ids, whereas with multiple assignment both variables have the same id:
a = 257
b = 257
print(id(a)) #139828809414512
print(id(b)) #139828809414224
a, b = 257, 257
print(id(a)) #139828809414416
print(id(b)) #139828809414416
But this behaviour can't be explained by saying that multiple assignment of equal values always binds both names to the same object, since:
a, b = -1000, -1000
print(id(a)) #139828809414448
print(id(b)) #139828809414288
Is there a clear rule that explains when the variables get the same id and when they don't?
Edit, relevant info: the code in this question was run in interactive mode (ipython3).
This is due to a constant folding optimization in the bytecode compiler. When the bytecode compiler compiles a batch of statements, it uses a dict to keep track of the constants it's seen. This dict automatically merges any equivalent constants.
Here's the routine responsible for recording and numbering constants (as well as a few related responsibilities):
static int
compiler_add_o(struct compiler *c, PyObject *dict, PyObject *o)
{
    PyObject *t, *v;
    Py_ssize_t arg;

    t = _PyCode_ConstantKey(o);
    if (t == NULL)
        return -1;

    v = PyDict_GetItem(dict, t);
    if (!v) {
        arg = PyDict_Size(dict);
        v = PyInt_FromLong(arg);
        if (!v) {
            Py_DECREF(t);
            return -1;
        }
        if (PyDict_SetItem(dict, t, v) < 0) {
            Py_DECREF(t);
            Py_DECREF(v);
            return -1;
        }
        Py_DECREF(v);
    }
    else
        arg = PyInt_AsLong(v);
    Py_DECREF(t);
    return arg;
}
You can see that it only adds a new entry and assigns a new number if it doesn't find an equivalent constant already present. (The _PyCode_ConstantKey bit makes sure things like 0.0, -0.0, and 0 are considered inequivalent.)
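The lookup-or-append logic of that routine can be sketched in Python. This is only an illustration, not CPython's code: `constant_key` and `add_constant` are hypothetical names, and the key function is a simplified stand-in for `_PyCode_ConstantKey` that handles just enough cases to keep `0`, `0.0` and `-0.0` apart.

```python
import math


def constant_key(value):
    """Hypothetical, simplified stand-in for _PyCode_ConstantKey."""
    if isinstance(value, float):
        # 0.0 == -0.0 in Python, so include the sign to keep them apart.
        return (float, value, math.copysign(1.0, value))
    # Including the type keeps 0 (int) apart from 0.0 (float).
    return (type(value), value)


def add_constant(table, value):
    """Return the index for value, reusing an entry if an equivalent one exists."""
    key = constant_key(value)
    if key not in table:
        # New constant: it gets the next free index.
        table[key] = len(table)
    return table[key]


table = {}
indices = [add_constant(table, v) for v in (257, 257, 0, 0.0, -0.0, 257)]
print(indices)  # [0, 0, 1, 2, 3, 0]
```

Both occurrences of 257 map to index 0, which is why two literals in the same batch end up as one object.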
In interactive mode, a batch ends every time the interpreter has to actually run your command, so constant folding mostly doesn't happen across commands:
>>> a = 1000
>>> b = 1000
>>> a is b
False
>>> a = 1000; b = 1000 # 1 batch
>>> a is b
True
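The batching can be approximated outside the REPL with the built-in compile(), treating each call as one interactive command. This is a sketch under that assumption; the filenames such as "<cmd1>" are arbitrary labels, and the result for separately compiled commands is an implementation detail, so it is printed rather than relied on.

```python
ns = {}

# Two separate "commands": each compile() call is its own batch.
exec(compile("a = 1000", "<cmd1>", "single"), ns)
exec(compile("b = 1000", "<cmd2>", "single"), ns)
separate = ns["a"] is ns["b"]

# One "command" containing both assignments: a single batch.
exec(compile("a = 1000; b = 1000", "<cmd3>", "single"), ns)
together = ns["a"] is ns["b"]

print("separate batches:", separate)  # typically False on CPython
print("same batch:", together)
```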
In a script, all top-level statements are one batch, so more constant folding happens:
a = 257
b = 257
print(a is b)
In a script, this prints True.
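One way to check this on CPython is to compile the script text yourself and look at the resulting code object's co_consts tuple. A minimal sketch, with "<script>" as an arbitrary filename label:

```python
script = "a = 257\nb = 257\n"
code = compile(script, "<script>", "exec")

# Count how many entries in the constants table are the value 257.
count_257 = sum(1 for const in code.co_consts if const == 257)

ns = {}
exec(code, ns)
same_object = ns["a"] is ns["b"]

print(code.co_consts)
print(count_257, same_object)
```

The two literals share one slot in co_consts, so both names are bound to the same int object.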
A function's code gets its constants tracked separately from code outside the function, which limits constant folding:
a = 257
def f():
    b = 257
    print(a is b)
f()
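Even in a script, this prints False on the CPython version described here. (Which objects end up shared is an implementation detail and may differ in other versions.)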
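The separate bookkeeping is visible in the code objects: the function body is compiled to its own code object, stored as a constant of the module, with its own co_consts. A sketch assuming CPython, where the function here returns the comparison instead of printing it; whether the two 257 objects are identical varies by version, so that result is printed without a claimed value.

```python
source = (
    "a = 257\n"
    "def f():\n"
    "    b = 257\n"
    "    return a is b\n"
)
module_code = compile(source, "<script>", "exec")

# The function's code object is itself one of the module's constants.
function_code = next(
    const for const in module_code.co_consts if hasattr(const, "co_consts")
)

module_has_257 = 257 in module_code.co_consts
function_has_257 = 257 in function_code.co_consts
print(module_has_257, function_has_257)

ns = {}
exec(module_code, ns)
print(ns["f"]())  # version-dependent; do not rely on it
```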