python numpy indices

np.where on an MxN NumPy matrix, returning M rows with the indices where the condition holds


I am trying to use np.where on an MxN NumPy matrix so that the result has M rows, each containing the column indices at which the condition holds in that row. Is this possible? For example:

import numpy as np

a = np.array([[1, 2, 2],
              [2, 3, 5]])

np.where(a == 2)

I would like this to return:

[[1, 2],
 [0]]

Solution

  • One option is to post-process the output of np.where, splitting the column indices wherever the row index changes:

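    import numpy as np

    a = np.array([[1, 2, 2],
                  [2, 3, 5]])
    
    # i holds the row index and j the column index of every match
    i, j = np.where(a == 2)
    
    # split j at each position where the row index changes
    out = np.split(j, np.diff(i).nonzero()[0]+1)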
    

    Alternatively, using a list comprehension:

    out = [np.where(x==2)[0] for x in a]
    

    Output:

    [array([1, 2]), array([0])]
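
One difference between the two options is worth noting: the np.diff-based split only produces an entry for rows that contain at least one match, whereas the list comprehension returns an empty array for a row without matches. If M entries are always needed, the split points can be computed from the row numbers instead. The following is a minimal sketch, assuming a three-row example array invented for illustration (its middle row has no match):

```python
import numpy as np

# Hypothetical example: the middle row contains no 2.
a = np.array([[1, 2, 2],
              [4, 3, 5],
              [2, 3, 5]])

i, j = np.where(a == 2)

# np.where scans in row-major order, so i is sorted. searchsorted then
# gives, for each row 1..M-1, the position in j where that row's block starts.
cuts = np.searchsorted(i, np.arange(1, a.shape[0]))

# Repeated cut positions yield empty arrays for rows without a match.
out = np.split(j, cuts)
```

Here out has one entry per row of a, and the entry for the middle row is an empty array.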
    

    Using this approach to average, per row, the values of another array at the matching positions:

    a = np.array([[1, 2, 2], [2, 3, 5]])
    b = np.array([[10, 20, 30], [40, 50, 60]])
    
    m = a == 2
    i, j = np.where(m)
    # (array([0, 0, 1]), array([1, 2, 0]))
    
    idx = np.r_[0, np.diff(i).nonzero()[0]+1]
    # array([0, 2])
    
    out = np.add.reduceat(b[m], idx)/np.add.reduceat(m[m], idx)
    # array([50, 40])/array([2, 1])
    

    Output:

    array([25., 40.])
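
For comparison, the same per-row average can also be sketched without reduceat, by zeroing out the non-matching entries and dividing the row sums by the row counts. This is an alternative formulation, not the method used above; the names sums and counts are introduced here for illustration. A row with no match would divide 0 by 0 and give nan along with a runtime warning.

```python
import numpy as np

a = np.array([[1, 2, 2], [2, 3, 5]])
b = np.array([[10, 20, 30], [40, 50, 60]])

m = a == 2

# Keep b only where the mask is True, replace everything else with 0,
# then sum along each row.
sums = np.where(m, b, 0).sum(axis=1)

# Number of matches in each row.
counts = m.sum(axis=1)

out = sums / counts
```

This trades a little extra memory (a full MxN temporary) for code that stays aligned with the rows of a.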
    
    Handling NaNs, which are excluded from both the sum and the count:
    
    a = np.array([[1, 2, 2], [2, 3, 5]])
    b = np.array([[10, 20, np.nan], [40, 50, 60]])
    
    m = a == 2
    i, j = np.where(m)
    # (array([0, 0, 1]), array([1, 2, 0]))
    
    idx = np.r_[0, np.diff(i).nonzero()[0]+1]
    # array([0, 2])
    
    b_m = b[m]
    # array([20., nan, 40.])
    nans = np.isnan(b_m)
    # array([False,  True, False])
    
    out = np.add.reduceat(np.where(nans, 0, b_m), idx)/np.add.reduceat(~nans, idx)
    # array([20., 40.])/array([1, 1])
    

    Output:

    array([20., 40.])
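
As a possible shortcut for the NaN case, np.nanmean can do the bookkeeping if the non-matching positions are themselves set to NaN first. This is a sketch of an alternative, not the reduceat approach shown above; the name masked is introduced here for illustration. A row in which every selected value is NaN (or which has no match at all) would yield nan with a runtime warning.

```python
import numpy as np

a = np.array([[1, 2, 2], [2, 3, 5]])
b = np.array([[10, 20, np.nan], [40, 50, 60]])

m = a == 2

# Replace every non-matching position with NaN so that nanmean ignores it,
# together with the NaNs already present in b.
masked = np.where(m, b, np.nan)

out = np.nanmean(masked, axis=1)
```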