pythonarraysnumpyindexingadvanced-indexing

Problem applying an operation by NumPy advanced indexing


I have two arrays by shape c = (2, 10, 2, 3) and p = (2, 5, 3). I want to pick the first vectors in the 3d dimension of c and minus it from each vectors in the corresponding row of p. It will be better understandable by this example, we can do this operation on the first row by c[0, :, None, 0] - p[0] or c[0, :, 0][:, None] - p[0], which will get result with shape (10, 5, 3). But I have confused how to apply this operation on all rows at once by advanced indexing, which result will be as shape (2, 10, 5, 3).

p = np.array([[[0., -0.05,  0.], [0., -0.05,  0.], [0.,  0.,  0.], [-0.05, -0.1,  0.05], [0.1,  0.2,  0.1]],
              [[0., -0.05,  0.], [0., -0.05,  0.], [0.,  0.,  0.], [-0.05, -0.1,  0.05], [0.1,  0.2,  0.1]]])

c = np.array([[[[ 0.05,  0. ,  0.05], [ 0.05, -0.1,  0.05]], [[ 0.05, -0.1,  0.05], [-0.05, -0.1,  0.05]],
               [[-0.05, -0.1,  0.05], [-0.05,  0. ,  0.05]], [[ 0.05,  0. , -0.05], [ 0.05, -0.1, -0.05]],
               [[ 0.05, -0.1, -0.05], [-0.05, -0.1, -0.05]], [[-0.05, -0.1, -0.05], [-0.05,  0. , -0.05]],
               [[ 0.05,  0. ,  0.05], [ 0.05,  0. , -0.05]], [[ 0.05, -0.1,  0.05], [ 0.05, -0.1, -0.05]],
               [[-0.05, -0.1,  0.05], [-0.05, -0.1, -0.05]], [[-0.05,  0. ,  0.05], [-0.05,  0. , -0.05]]],
              [[[ 0.05,  0.1,  0.05], [ 0.05,  0. ,  0.05]], [[ 0.05,  0. ,  0.05], [-0.05,  0. ,  0.05]],
               [[-0.05,  0. ,  0.05], [-0.05,  0.1,  0.05]], [[ 0.05,  0.1, -0.05], [ 0.05,  0. , -0.05]],
               [[ 0.05,  0. , -0.05], [-0.05,  0. , -0.05]], [[-0.05,  0. , -0.05], [-0.05,  0.1, -0.05]],
               [[ 0.05,  0.1,  0.05], [ 0.05,  0.1, -0.05]], [[ 0.05,  0. ,  0.05], [ 0.05,  0. , -0.05]],
               [[-0.05,  0. ,  0.05], [-0.05,  0. , -0.05]], [[-0.05,  0.1,  0.05], [-0.05,  0.1, -0.05]]]])

additional suggestion:
It is not the main issue, but will be helpful for me and I will be grateful for any alternative way to handle this:
I have broadcasted p from shape (5, 3) to (2, 5, 3) to be corresponded with 1st dimension of c. Is there a better way, without this broadcasting (may be just by advanced indexing instead), to handle this problem?


Solution

  • Let's start here:

    I have broadcasted p from shape (5, 3) to (2, 5, 3) to be corresponded with 1st dimension of c. Is there a better way, without this broadcasting (may be just by advanced indexing instead), to handle this problem?

    So, if I understand correctly, p should actually be (5, 3), you doubled it manually to make a (2, 5, 3) array. As you'll see, you do not have to do that, actually sticking to a (5, 3) array is going to make the whole thing much easier.

    Now when you do this:

    c[0, :, None, 0]
    

    What you get is a (10, 1, 3) array. You're on the right track here, as the second dimension of 1 is what you'll need for broadcasting, but again, you can do even simpler with

    c[:, :, None, 0]
    

    which will give you a (2, 10, 1, 3) array. Which is something you really much need in order to take advantage of numpy's broadcasting rules:

    When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when: 1. they are equal, or 2. one of them is 1

    So, if we have p.shape == (5, 3) and c transformed so that c.shape == (2, 10, 1, 3), the matching will go right from left:

    Schematically:

     (2, 10, 1, 3)
      |  |   |  |
      |  |   V  V
      |  |  (5, 3)
      |  |   |  |
      V  V   V  V
     (2, 10, 5, 3) 
    

    Long explanation but actually the application is going to be very simple. So, let's try it:

    import numpy as np
    
    # let's keep this one at (5, 3)
    p = np.array([[0., -0.05,  0.], [0., -0.05,  0.], [0.,  0.,  0.], [-0.05, -0.1,  0.05], [0.1,  0.2,  0.1]])
    
    c = np.array([[[[ 0.05,  0. ,  0.05], [ 0.05, -0.1,  0.05]], [[ 0.05, -0.1,  0.05], [-0.05, -0.1,  0.05]],
                   [[-0.05, -0.1,  0.05], [-0.05,  0. ,  0.05]], [[ 0.05,  0. , -0.05], [ 0.05, -0.1, -0.05]],
                   [[ 0.05, -0.1, -0.05], [-0.05, -0.1, -0.05]], [[-0.05, -0.1, -0.05], [-0.05,  0. , -0.05]],
                   [[ 0.05,  0. ,  0.05], [ 0.05,  0. , -0.05]], [[ 0.05, -0.1,  0.05], [ 0.05, -0.1, -0.05]],
                   [[-0.05, -0.1,  0.05], [-0.05, -0.1, -0.05]], [[-0.05,  0. ,  0.05], [-0.05,  0. , -0.05]]],
                  [[[ 0.05,  0.1,  0.05], [ 0.05,  0. ,  0.05]], [[ 0.05,  0. ,  0.05], [-0.05,  0. ,  0.05]],
                   [[-0.05,  0. ,  0.05], [-0.05,  0.1,  0.05]], [[ 0.05,  0.1, -0.05], [ 0.05,  0. , -0.05]],
                   [[ 0.05,  0. , -0.05], [-0.05,  0. , -0.05]], [[-0.05,  0. , -0.05], [-0.05,  0.1, -0.05]],
                   [[ 0.05,  0.1,  0.05], [ 0.05,  0.1, -0.05]], [[ 0.05,  0. ,  0.05], [ 0.05,  0. , -0.05]],
                   [[-0.05,  0. ,  0.05], [-0.05,  0. , -0.05]], [[-0.05,  0.1,  0.05], [-0.05,  0.1, -0.05]]]])
    
    # now, it's really as simple as that
    result = c[:, :, None, 0] - p  # shape: (2, 10, 5, 3)