python reference immutability mutability

How do Python references work? Why do lists share modifications made through other references while integers don't?


The following snippet:

a = [1,2,3,4,5]
b = a
b.append(6)

print(a)
print(b)

prints

[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6]

so when modifying b, we also modify the list accessible through a.

However, with integers, this snippet:

a = 1
b = a
b += 1

print(a)
print(b)

prints

1
2

So here, a and b seem to not reference the same thing? Why is the value of a not 2 in the second snippet?


Solution

  • In Python, everything is an object, and every name is a reference (a pointer) to an object's address, per the docs.

    On that page you can scroll down and find the following:

    Numeric objects are immutable; once created their value never changes

    Under that you'll see the int type defined, so it makes perfect sense that your second example works the way it does.
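To make that concrete, here is a minimal sketch of the two snippets from the question side by side. It assumes nothing beyond the built-in int and list types; the names same_before, same_after, xs and ys are only illustrative. Because int is immutable, b += 1 cannot change the object in place and instead rebinds b to a different object, whereas a list implements += in place.

```python
# Integers: += has to produce a new object, so b is rebound.
a = 1
b = a
same_before = b is a   # both names are bound to the same int object

b += 1                 # int cannot change in place; b is rebound to the int 2
same_after = b is a    # b now names a different object than a

print(a, b)                      # 1 2
print(same_before, same_after)   # True False

# Lists: += extends the existing object, so every name bound to it sees it.
xs = [1, 2, 3]
ys = xs
ys += [4]              # in-place extend of the shared list

print(xs)              # [1, 2, 3, 4]
print(ys is xs)        # True
```

The names never stopped being references in either case; the difference is only whether the operation mutates the shared object or produces a new one.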

    On the top of the same page, you'll find the following:

    Every object has an identity, a type and a value. An object’s identity never changes once it has been created; you may think of it as the object’s address in memory.

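    Python behaves much like Java here: a name holds a reference to an object, and assigning to a name rebinds that name rather than changing the object it referred to. Python, like Java, is also pass-by-value, where the value passed is the object reference (sometimes called call by sharing), and it doesn't have a pass-by-reference semantic.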
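A small sketch of what that means for function arguments. The function names rebind and mutate are hypothetical, chosen only for this illustration. Rebinding the parameter inside a function does not touch the caller's name, while mutating the object the parameter refers to is visible to the caller.

```python
def rebind(items):
    # Assigning to the parameter only rebinds the local name;
    # the caller's list is left alone.
    items = [0]


def mutate(items):
    # Calling a mutating method changes the shared object itself.
    items.append(6)


a = [1, 2, 3, 4, 5]

rebind(a)
after_rebind = list(a)   # snapshot of a after the rebinding call

mutate(a)

print(after_rebind)      # [1, 2, 3, 4, 5]
print(a)                 # [1, 2, 3, 4, 5, 6]
```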

    Looking at your second example:

    >>> a = 1
    >>> hex(id(a))
    '0x7ffdc64cd420'
    >>> b = a + 1
    >>> hex(id(b))
    '0x7ffdc64cd440'
    >>> print(a)
    1
    >>> print(b)
    2
    

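    This shows that the operation b = a + 1 leaves a at 1 while b is now 2. That's because int is immutable: the addition cannot alter the object that a refers to, so it yields a different object and b is bound to that one. In CPython, names bound to the value 1 all refer to the same object, at the same address: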

    >>> a = 1
    >>> b = 2
    >>> c = 1
    >>> hex(id(a))
    '0x7ffdc64cd420'
    >>> hex(id(b))
    '0x7ffdc64cd440'
    >>> hex(id(c))
    '0x7ffdc64cd420'
    

    Now this only holds true for the values -5 to 256 in the C implementation (CPython), which caches those small integers; beyond that range you may get new addresses, but the immutability shown above still holds. I've shown you the sharing of memory addresses for a reason. On the same page you'll find the following:

    Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that c = d = [] assigns the same object to both c and d.)
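The guarantees in that passage can be checked directly. This is a minimal sketch using the same names as the quoted text (c and d), plus e, f, x, y and small_ints_shared, which are added here for illustration. Only the list results are guaranteed by the language; whether two equal integers share an object is left to the implementation, so no expected value is claimed for that line.

```python
c = []
d = []
print(c is d)        # False: two separate, newly created empty lists

e = f = []           # one list object bound to two names
print(e is f)        # True
e.append(1)
print(f)             # [1]

x = 1
y = 1
print(x == y)        # True: equal values
# Whether x and y are the same object depends on the implementation
# (CPython happens to cache small integers), so do not rely on it.
small_ints_shared = x is y
```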

    So your example:

    >>> a = [1, 2, 3, 4, 5]
    >>> hex(id(a))
    '0x17292e1cbc8'
    >>> b = a
    >>> hex(id(b))
    '0x17292e1cbc8'
    

    I should be able to stop right here: it's obvious that both a and b refer to the same object in memory at address 0x17292e1cbc8. That's because the above is like saying:

    # Let's assume that `[1, 2, 3, 4, 5]` is 0x17292e1cbc8 in memory
    >>> a = 0x17292e1cbc8
    >>> b = a
    >>> print(hex(b))
    0x17292e1cbc8
    

    The long and short of it? You're simply assigning a pointer to a new name, so both names point to the same object in memory! Note: this is not the same as a shallow copy, because no new compound object is created.
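For contrast, here is a rough sketch of how plain assignment differs from an actual shallow copy, using copy.copy from the standard library. The names alias and shallow, and the nested list used as sample data, are assumptions made for the illustration.

```python
import copy

a = [1, 2, [3, 4]]
alias = a                 # plain assignment: a second name for the same list
shallow = copy.copy(a)    # shallow copy: a new outer list, same inner objects

alias.append(5)           # visible through a, because alias is a
print(a)                  # [1, 2, [3, 4], 5]
print(shallow)            # [1, 2, [3, 4]]

shallow[2].append(99)     # the inner list is still shared with a
print(a)                  # [1, 2, [3, 4, 99], 5]
print(shallow)            # [1, 2, [3, 4, 99]]
```

If the goal is a list that can be modified without affecting a at all, including nested objects, copy.deepcopy is the usual tool.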