Tags: python, numpy, array-broadcasting, numpy-indexing

NumPy advanced indexing using np.ix_() does not always result in the desired shape


I have a snippet of code that looks like this:

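import numpy as np


def slice_table(table, index_vector):
    # Collect the list-valued indices and remember which axes they belong to.
    to_index_product = []
    array_indices = []
    for i, index in enumerate(index_vector):
        if isinstance(index, list):
            to_index_product.append(index)
            array_indices.append(i)

    # np.ix_() turns the lists into arrays that broadcast against each other.
    index_product = np.ix_(*to_index_product)
    for i, multiple in enumerate(index_product):
        index_vector[array_indices[i]] = multiple

    # Integers and slices are left as they are.
    index_vector = tuple(index_vector)
    sliced_table = table[index_vector]
    return sliced_table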

table is an np.ndarray of shape (6, 7, 2, 2, 2, 11, 9).

The purpose of the function is to pick out the values that satisfy all the given indices. Advanced NumPy indexing pairs the given index arrays element by element and picks out separate values, rather than the intersections I want, so I use np.ix_() to build index arrays that let me extract every combination of the listed indices along each dimension. My initial test slice worked as desired, so I was content with the code:

index_vector = [5, [1, 2], 1, 1, 1, [0, 3, 7], slice(0, 9, None)]
# The actual `index_vector` is code-generated, hence the usage of `slice()` object
sliced_table = slice_table(table, index_vector)
sliced_table.shape  # (2, 3, 9)

In this example, every dimension except for the 2nd, 6th and 7th gets an integer index and is thus absent from the slice. The shape of the slice follows directly from the vector: it has 2 integers as the 2nd index, 3 integers as the 6th, and a slice as the 7th (meaning the entire length of that dimension is preserved). These examples also work:

index_vector = [5, [1, 2], 1, 1, 1, [0, 3, 7, 8], 1]
sliced_table = slice_table(table, index_vector)
sliced_table.shape  # (2, 4)

index_vector = [5, [1, 2], 1, 1, 1, [0, 3, 7, 8], [1, 3]]
sliced_table = slice_table(table, index_vector)
sliced_table.shape  # (2, 4, 2)
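
For reference, here is a minimal sketch of the difference between paired advanced indexing and np.ix_() that the function relies on. The small array `a` and the names `paired`, `rows`, `cols` and `block` are made up for illustration and are not part of my actual code:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Plain advanced indexing pairs the index lists element by element:
# it picks a[0, 1] and a[2, 3].
paired = a[[0, 2], [1, 3]]
print(paired.tolist())  # [1, 11]

# np.ix_() reshapes the lists to (2, 1) and (1, 2) so that they broadcast
# into the full intersection of rows 0, 2 and columns 1, 3.
rows, cols = np.ix_([0, 2], [1, 3])
block = a[rows, cols]
print(block.tolist())  # [[1, 3], [9, 11]]
```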

However, for the code below, the shape is not what I expect it to be:

index_vector = [
    slice(0, 6, None),
    [1, 2],
    slice(0, 2, None),
    slice(0, 2, None),
    slice(0, 2, None),
    [0, 3, 7, 8],
    1,
]
sliced_table = slice_table(table, index_vector)
sliced_table.shape  # (2, 4, 6, 2, 2, 2)

The shape I want it to be is (6, 2, 2, 2, 2, 4), but for some reason there's a reshuffling taking place and the shape is all wrong. It's a bit hard to say whether the elements are wrong, too, because most of table is filled with None, but from the non-NoneType objects that I get, it feels that I get the desired elements (I don't see any undesired ones, that is), just reshaped for some reason.

Why does this happen? Maybe I don't correctly understand how np.ix_() works and I can't just build a product of array indices and extract the desired matrices for each dimension one by one, like I do in my function? Or is there something I don't get about NumPy indexing?


Solution

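  • As @hpaulj mentioned, when advanced indices are separated by slices, the dimensions produced by advanced indexing come first in the result, followed by the dimensions from basic indexing. Slice objects trigger basic indexing, so their dimensions are appended after the ones produced by the arrays from np.ix_(). In your last example the lists sit in the 2nd and 6th positions with slices between them, which gives (2, 4) followed by (6, 2, 2, 2). An excerpt from the docs: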

    The easiest way to understand a combination of multiple advanced indices may be to think in terms of the resulting shape. There are two parts to the indexing operation, the subspace defined by the basic indexing (excluding integers) and the subspace from the advanced indexing part. Two cases of index combination need to be distinguished:

    The advanced indices are separated by a slice, Ellipsis or newaxis. For example x[arr1, :, arr2].

    The advanced indices are all next to each other. For example x[..., arr1, arr2, :] but not x[arr1, :, 1] since 1 is an advanced index in this regard.

    In the first case, the dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions after that. In the second case, the dimensions from the advanced indexing operations are inserted into the result array at the same spot as they were in the initial array (the latter logic is what makes simple advanced indexing behave just like slicing).
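
One possible workaround, sketched here under some assumptions, is to expand every slice into an explicit list of indices before calling np.ix_(). Then no basic-indexing slices remain, and the result follows the broadcast shape of the index arrays, which is in the original axis order. The function name `slice_table_ordered` is hypothetical, the integer `table` is only a stand-in with the shape from the question, and the sketch assumes that scalar indices are plain Python ints:

```python
import numpy as np


def slice_table_ordered(table, index_vector):
    """Hypothetical variant of slice_table that keeps the original axis order."""
    index_vector = list(index_vector)  # work on a copy, leave the caller's list alone
    positions, sequences = [], []
    for axis, index in enumerate(index_vector):
        if isinstance(index, slice):
            # Expand the slice into explicit indices so it becomes an advanced index.
            index = range(*index.indices(table.shape[axis]))
        if not isinstance(index, int):  # assumption: scalars are plain Python ints
            positions.append(axis)
            sequences.append(list(index))
    # Every list and expanded slice gets its own broadcast axis from np.ix_().
    for axis, open_mesh in zip(positions, np.ix_(*sequences)):
        index_vector[axis] = open_mesh
    return table[tuple(index_vector)]


# Stand-in data with the same shape as the table in the question.
table = np.arange(6 * 7 * 2 * 2 * 2 * 11 * 9).reshape(6, 7, 2, 2, 2, 11, 9)
index_vector = [
    slice(0, 6), [1, 2], slice(0, 2), slice(0, 2), slice(0, 2), [0, 3, 7, 8], 1,
]

# What the original function effectively does: advanced indices separated
# by slices, so the advanced dimensions (2, 4) come first.
mixed = table[:, [[1], [2]], :, :, :, [[0, 3, 7, 8]], 1]
print(mixed.shape)  # (2, 4, 6, 2, 2, 2)

ordered = slice_table_ordered(table, index_vector)
print(ordered.shape)  # (6, 2, 2, 2, 2, 4)
```

If you would rather keep the original function, moving the two leading axes back into place with np.moveaxis(mixed, [0, 1], [1, 5]) should give the same array for this particular index vector, but the target positions have to be worked out for each combination of indices.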