numpyarray-broadcasting

How to add each row of a matrix to the selected rows of another matrix?


I want to achieve the following result fast

for i in range(n):
    C[index[i]] += A[i]

where C is an d * m matrix, index is an array containing n integers ranging in [0, d), and A is an n * m matrix.

I wish to find a faster, more numpy approach.


Solution

  • Assuming C is (d, m) (I don't see how it could work otherwise), you need to collect and sum the values from A in order, then add them to the correct indices in C. For that you can use argsort, unique and add.reduce:

    # sort the indices in index
    order = np.argsort(index)
    
    # get the unique indices from the sorted index
    # and the position of the first occurrence
    idx, idx2 = np.unique(index[order], return_index=True)
    
    C1[idx] += np.add.reduceat(A[order], idx2)
    

    Example:

    d, n, m = 4, 5, 3
    np.random.seed(1)
    
    # C is an d * m matrix
    C = np.random.random((d, m))
    # make copies to compare the vectorial and loop approaches
    C1 = C.copy()
    C2 = C.copy()
    
    # index is an array containing n integers ranging in [0, d)
    index = np.random.randint(0, d, n)
    # array([3, 0, 2, 0, 1])
    
    # A is an n * m matrix
    A = np.random.random((n, m))
    
    order = np.argsort(index)
    idx, idx2 = np.unique(index[order], return_index=True)
    
    C1[idx] += np.add.reduceat(A[order], idx2)
    
    for i in range(n):
        C2[index[i]] += A[i]
    
    np.allclose(C1, C2)
    # True