python list multiplication shallow-copy python-datamodel

Multiply operator applied to a list (data structure)


I'm reading How to Think Like a Computer Scientist, which is an introductory text on Python programming.

I want to clarify the behaviour of the multiplication operator (*) when applied to lists.

Consider the function make_matrix:

def make_matrix(rows, columns):
    """
      >>> make_matrix(4, 2)
      [[0, 0], [0, 0], [0, 0], [0, 0]]
      >>> m = make_matrix(4, 2)
      >>> m[1][1] = 7
      >>> m
      [[0, 0], [0, 7], [0, 0], [0, 0]]
    """
    return [[0] * columns] * rows

However, after m[1][1] = 7 the actual value of m is

[[0, 7], [0, 7], [0, 7], [0, 7]]

The correct version of make_matrix is:

def make_matrix(rows, columns):
    """
      >>> make_matrix(3, 5)
      [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
      >>> make_matrix(4, 2)
      [[0, 0], [0, 0], [0, 0], [0, 0]]
      >>> m = make_matrix(4, 2)
      >>> m[1][1] = 7
      >>> m
      [[0, 0], [0, 7], [0, 0], [0, 0]]
    """
    matrix = []
    for row in range(rows):
        matrix += [[0] * columns]
    return matrix

The reason why the first version of make_matrix fails (as explained in the book at 9.8) is that

...each row is an alias of the other rows...

I wonder why

[[0] * columns] * rows

causes ...each row is an alias of the other rows...

but not

[[0] * columns]

i.e. why is each 0 within a row not an alias of the other elements in that row?


Solution

  • Everything in Python is an object, and Python never makes copies unless explicitly asked to do so.

    When you do

    innerList = [0] * 10
    

    you create a list with 10 elements, all of them referring to the same int object 0.

    Since integer objects are immutable, when you do

    innerList[1] = 15
    

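    you are changing the second element of the list so that it refers to another integer, 15. The other nine elements still refer to 0: item assignment rebinds one slot of the list, and because int objects are immutable there is no operation that could change the shared 0 in place.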
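
The rebinding described above can be checked directly. This is only a minimal sketch: the `is` checks are added here for illustration and are not part of the original answer.

```python
innerList = [0] * 10

# Every slot holds a reference to the very same int object.
assert all(item is innerList[0] for item in innerList)

# Item assignment rebinds slot 1 only; the other slots still refer to 0.
innerList[1] = 15
print(innerList)  # [0, 15, 0, 0, 0, 0, 0, 0, 0, 0]
```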

    In the same way,

    outerList = [innerList] * 5
    

    will create a list object with 5 elements, each one a reference to the same innerList, just as above. But since list objects are mutable:

    outerList[2].append('something')
    

    is the same as:

    innerList.append('something')
    

    Because they are two references to the same list object. So the element ends up in that single list. It appears to be duplicated, but the fact is that there is only one list object, and many references to it.
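
A short sketch of that aliasing, assuming a three-element innerList for brevity (the size and the `is` check are illustrative choices, not taken from the answer):

```python
innerList = [0] * 3
outerList = [innerList] * 5

# All five slots are references to one and the same list object.
assert all(row is innerList for row in outerList)

# Mutating through any one reference is visible through all of them.
outerList[2].append('something')
print(innerList)     # [0, 0, 0, 'something']
print(outerList[0])  # [0, 0, 0, 'something']
```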

    By contrast, if you do

    outerList[1] = outerList[1] + ['something']
    

    you are creating another list object (using + on lists always builds a new list) and storing a reference to it in the second position of outerList. If you "append" the element this way (not really appending, but creating another list), innerList and the other elements of outerList will be unaffected.
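
The following sketch shows the effect of + and then applies the same idea to make_matrix. The list comprehension is one possible alternative to the loop in the question, chosen here for illustration; it is not the version from the book.

```python
innerList = [0, 0]
outerList = [innerList] * 3

# + builds a new list, so only slot 1 is rebound to it.
outerList[1] = outerList[1] + ['something']
print(outerList)  # [[0, 0], [0, 0, 'something'], [0, 0]]
print(innerList)  # [0, 0]


def make_matrix(rows, columns):
    # The inner expression is re-evaluated for every row,
    # so each row is a distinct list object.
    return [[0] * columns for _ in range(rows)]


m = make_matrix(4, 2)
m[1][1] = 7
print(m)  # [[0, 0], [0, 7], [0, 0], [0, 0]]
```

Because every row is built by a separate evaluation of [0] * columns, no two rows share a list object, which is exactly what the original [[0] * columns] * rows fails to guarantee.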