pythonnumpymatrix-indexing

Explanation of numpy indexing ndarray[(4, 2), (5, 3)]


Question

Please help understand the design decision or rational of the Numpy indexing with tuple (i, j) into a ndarray.

Background

When the index is a single tuple (4, 2), then (i=row, j=column).

shape = (6, 7)
X = np.zeros(shape, dtype=int)
X[(4, 2)] = 1
X[(5, 3)] = 1
print("X is :\n{}\n".format(X))
---
X is :
[[0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0]
 [0 0 1 0 0 0 0]    <--- (4, 2)
 [0 0 0 1 0 0 0]]   <--- (5, 3)

However, when the indices are multiple tuples (4, 2), (5, 3), then (i=row, j=row) for (4, 2) and (i=column, j=column) for (5, 3).

shape = (6, 7)
Y = np.zeros(shape, dtype=int)
Y[(4, 2), (5, 3)] = 1
print("Y is :\n{}\n".format(Y))
---
Y is :
[[0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0]
 [0 0 0 1 0 0 0]    <--- (2, 3)
 [0 0 0 0 0 0 0]
 [0 0 0 0 0 1 0]    <--- (4, 5)
 [0 0 0 0 0 0 0]]

It means you are constructing a 2d array R, such that R=A[B, C]. This means that the value for rij=abijcij.

So it means that the item located at R[0,0] is the item in A with as row index B[0,0] and as column index C[0,0]. The item R[0,1] is the item in A with row index B[0,1] and as column index C[0,1], etc.

multi_index: A tuple of integer arrays, one array for each dimension.

Why not always (i=row, j=column)? What will happen if it is always (i=row, j=column)?


Updated

With the answers from Akshay and @DaniMesejo, understood:

X[
  (4),    # dimension 2 indices with only 1 element
  (2)     # dimension 1 indices with only 1 element
] = 1

Y[
  (4, 2, ...), # dimension 2 indices 
  (5, 3, ...)  # dimension 1 indices (dimension 0 is e.g. np.array(3) whose shape is (), in my understanding)
] = 1

Solution

  • It's quite easy to understand how it works (and the motivation behind this design decision).

    Numpy stores its ndarrays as contiguous blocks of memory. Each element is stored in a sequential manner every n bytes after the previous.

    (images referenced from this excellent SO post)

    So if your 3D array looks like this -

    enter image description here

    Then in memory its stores as -

    enter image description here

    When retrieving an element (or a block of elements), NumPy calculates how many strides (bytes) it needs to traverse to get the next element in that direction/axis. So, for the above example, for axis=2 it has to traverse 8 bytes (depending on the datatype) but for axis=1 it has to traverse 8*4 bytes, and axis=0 it needs 8*8 bytes.

    With this in mind, let's look at what you are trying to do.

    print(X)
    print(X.strides)
    
    [[0 0 0 0 0 0 0]
     [0 0 0 0 0 0 0]
     [0 0 0 0 0 0 0]
     [0 0 0 0 0 0 0]
     [0 0 1 0 0 0 0]
     [0 0 0 1 0 0 0]]
    
    #Strides (bytes) required to traverse in each axis.
    (56, 8)
    

    For your array, to get the next element in axis=0, we need to traverse 56 bytes, and for the next element in axis=1, we need 8 bytes.

    When you are indexing (4,2), NumPy is going 56*4 bytes in axis=0 and 8*2 bytes in axis=1 to access that. Similarly, if you want to access (4,2) and (5,3), it will have to access 56*(4,5) in axis=0 and 8*(2,3) in axis=1.

    This is why the design is what it is because it aligns with how NumPy actually indexes elements using strides.

    X[(axis0_indices), (axis1_indices), ..]
    
    X[(4, 5), (2, 3)] #(row indices), (column indices)
    
    array([1, 1])
    

    With this design, it's easy to scale to higher dimension tensors (say, 8-dimensional arrays) as well! If you were mentioning each index tuple separately, it will require elements * dimension number of computations to fetch those. While with this design, it can just broadcast the stride values to the tuple for each axis and fetch these much faster!