python python-3.x cpython

How does `is` work in the case of ephemeral objects sharing the same memory address?


Note that this question might be (is?) specific to CPython.

Say you have some list, and check copies of the list for identity against each other:

>>> a=list(range(10))
>>> b,c=a[:],a[:]
>>> b is c
False
>>> id(b), id(c)
(3157888272304, 3157888272256)

No great shakes there. But if we do this in a more ephemeral way, things might seem a bit weird at first:

>>> a[:] is a[:]
False  # <- two ephemeral copies not the same object (duh)
>>> id(a[:]),id(a[:])
(3157888272544, 3157888272544)   # <- but two other ephemerals share the same id..? hmm....

...until we recognize what is probably going on here. I have not confirmed it by looking at the CPython implementation (I can barely read C, so it would be a waste of time, to be honest), but it at least seems obvious that even though the two objects have the same id, CPython is smart enough to know that they aren't the same object.

Assuming this is correct, my question is: what criterion is CPython using to determine that the two ephemeral objects are not the same object, given that they have the same id (presumably for efficiency reasons; see below)? Is it perhaps looking at the time each was marked to be garbage collected? The time each was created? Or something else...?

My theory on why they have the same id is that, likely, CPython knows an ephemeral copy of the list was already made and is waiting to be garbage collected, and it just efficiently re-uses the same memory location. It would be great if an answer could clarify/confirm this as well.


Solution

  • Two immutable objects sharing the same address would indeed be indistinguishable from each other, which is the concern you raise.

    The thing is that when you do `a[:] is a[:]`, the two objects are not at the same address. For the identity operator `is` to compare them, both operands have to exist at the same time, so there is still a reference to the object on the left-hand side when the native code for `is` actually runs.

    On the other hand, when you do `id(a[:]), id(a[:])`, the object inside the parentheses of the first call is left without any references as soon as that `id` call is done. It is destroyed at that point, freeing its memory block to be used by the second `a[:]`.
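The two cases above can be observed directly with a minimal sketch, assuming CPython's reference counting. `Tracked` and `make()` are hypothetical helpers introduced only for illustration: `Tracked` is a list subclass that logs its own destruction, and `make()` stands in for `a[:]`.

```python
# CPython-specific sketch; Tracked and make() are hypothetical helpers,
# not part of the original question.

class Tracked(list):
    """List subclass that records when an instance is destroyed."""
    log = []

    def __del__(self):
        self.log.append("destroyed")


def make():
    """Stand-in for a[:]: returns a fresh, otherwise unreferenced list."""
    return Tracked(range(10))


# Like id(a[:]), id(a[:]): each temporary loses its last reference as soon
# as id() returns, so the second one may land in the block the first freed.
first = id(make())
Tracked.log.append("first id() returned")
second = id(make())
Tracked.log.append("second id() returned")

# Like a[:] is a[:]: both operands must be alive during the comparison,
# so they cannot occupy the same address.
left, right = make(), make()
same_object = left is right
distinct_ids = id(left) != id(right)

print(Tracked.log)
print(first == second)  # often True on CPython, but not guaranteed
print(same_object, distinct_ids)
```

The log should show each temporary being destroyed before the next statement runs, which is why the second `id()` call is free to see the same address. Whether the address is actually reused depends on the allocator, so the equality of `first` and `second` is printed rather than assumed.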