I want to achieve the following result fast
for i in range(n):
C[index[i]] += A[i]
where C
is an d * m matrix, index
is an array containing n integers ranging in [0, d)
, and A
is an n * m matrix.
I wish to find a faster, more numpy approach.
Assuming C
is (d, m)
(I don't see how it could work otherwise), you need to collect and sum the values from A
in order, then add them to the correct indices in C
. For that you can use argsort
, unique
and add.reduce
:
# sort the indices in index
order = np.argsort(index)
# get the unique indices from the sorted index
# and the position of the first occurrence
idx, idx2 = np.unique(index[order], return_index=True)
C1[idx] += np.add.reduceat(A[order], idx2)
Example:
d, n, m = 4, 5, 3
np.random.seed(1)
# C is an d * m matrix
C = np.random.random((d, m))
# make copies to compare the vectorial and loop approaches
C1 = C.copy()
C2 = C.copy()
# index is an array containing n integers ranging in [0, d)
index = np.random.randint(0, d, n)
# array([3, 0, 2, 0, 1])
# A is an n * m matrix
A = np.random.random((n, m))
order = np.argsort(index)
idx, idx2 = np.unique(index[order], return_index=True)
C1[idx] += np.add.reduceat(A[order], idx2)
for i in range(n):
C2[index[i]] += A[i]
np.allclose(C1, C2)
# True