python, python-3.x, numpy, array-indexing

The difference between double brackets and single square brackets in Python


I am trying to sort the array by its second row. For n1 I use two pairs of square brackets, [][], whereas for n2 I use a single pair with a comma, [,]. Why do I get different results?

Thank you in advance.

import numpy

sampleArray = numpy.array([[34, 43, 73], [82, 22, 12], [53, 94, 66]])

print(sampleArray)

n1 = sampleArray[:][sampleArray[1,:].argsort()]
n2 = sampleArray[:,sampleArray[1,:].argsort()]
print(n1)
print(n2)

Solution

  • At the interpreter level, you are doing the following:

    someindex = sampleArray[1, :].argsort()
    n1 = sampleArray.__getitem__(slice(None)).__getitem__(someindex)
    n2 = sampleArray.__getitem__((slice(None), someindex))
    

    The first call, __getitem__(slice(None)), in n1 is effectively a no-op: it just returns a view of the entire original array. The view is technically a separate object, but that does not affect the subsequent read. So n1 applies someindex along the first axis, which reorders the rows.

    For n2, you pass in a tuple of indices (remember that it's commas that make a tuple, not parentheses). When given a tuple as the argument to __getitem__, a numpy array applies each element of the tuple to the corresponding dimension, starting from the first. In this case, slice(None) selects all rows, while someindex reorders the columns.
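
To make the row-versus-column difference concrete, here is a minimal sketch using the array from the question. The names sample and order are illustrative stand-ins for sampleArray and someindex.

```python
import numpy as np

sample = np.array([[34, 43, 73], [82, 22, 12], [53, 94, 66]])

# The second row is [82, 22, 12], so argsort gives [2, 1, 0].
order = sample[1, :].argsort()

# sample[:] is a view of the whole array, so this is just sample[order]:
# the index is applied to the first axis and the rows are reordered.
n1 = sample[:][order]

# A single tuple index: all rows, columns taken in the given order.
# This is the form that actually sorts by the second row.
n2 = sample[:, order]

print(n1)
print(n2)
```

With this input, n1 holds the original rows in reverse order, while n2 has its second row in ascending order (12, 22, 82).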

    Moral of the story: a multidimensional numpy index is not, in general, equivalent to a chain of list-like indices. This is especially important for assignments: x[a, b] = c is x.__setitem__((a, b), c), while x[a][b] = c is x.__getitem__(a).__setitem__(b, c). The first form modifies x, as you would generally expect, but the index can be difficult to construct, e.g., if a is a mask. The second form is often easier to construct indices for, but it assigns into the temporary object returned by x[a]; when a is a mask or an index array, that temporary is a copy, so the assignment never reaches the original array. Stack Overflow has its share of questions about this variant of the scenario.
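
The assignment pitfall can be reproduced with a small sketch. The array x and the boolean mask below are made up for illustration and are not part of the question.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)    # [[0, 1, 2], [3, 4, 5]]
mask = np.array([True, False])    # hypothetical mask selecting the first row

# Tuple index: one __setitem__ call on y itself, so y is modified.
y = x.copy()
y[mask, 1] = 99

# Chained index: z[mask] is a copy (boolean indexing never returns a view),
# so the assignment lands in that temporary and z is left untouched.
z = x.copy()
z[mask][:, 1] = 99

print(y)
print(z)
```

Here y ends up with 99 in position (0, 1), whereas z still equals x. If the first index were a plain slice instead of a mask, z[...] would be a view and the chained assignment would write through, which is part of why this behavior is easy to trip over.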