I need to multiply each row with a column at the corresponding row index. Consider the graphic below, in which I have a 3 x 3 matrix. The required operation is to multiply row[0] of matrix with col[0] of transposed_matrix, row[1] of matrix with col[1] of transposed_matrix, and so on.
Question: How can I achieve this in CuPy/NumPy (Python) in a smart way (i.e. without using for-loops)?
Looks like you want:
out = (A**2).sum(axis=1)
Example:
# input
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
# output
array([ 14, 77, 194])
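This works because col[i] of the transposed matrix is the same as row[i] of the original, so each output element is the dot product of a row with itself, e.g. 1*1 + 2*2 + 3*3 = 14 for the first row.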
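As a quick sanity check, here is a minimal sketch that compares the one-liner against the explicit row-times-column loop described in the question. The name `expected` is only for illustration, and the loop is there purely as a reference, not as something to use in practice.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Vectorized: square every element, then sum along each row
out = (A**2).sum(axis=1)

# Reference loop: row[i] of A times col[i] of A.T
expected = np.array([A[i] @ A.T[:, i] for i in range(A.shape[0])])

print(out)                            # [ 14  77 194]
print(np.array_equal(out, expected))  # True
```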
If A and B are different matrices, then use:
out = (A*B.T).sum(axis=1)
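
For the two-matrix case, the result is the diagonal of A @ B, so it can also be written with np.einsum, which avoids building the intermediate A*B.T array. The sketch below is only an illustration: the values in B are made up, and it assumes A has shape (n, m) and B has shape (m, n). CuPy mirrors much of the NumPy API, so swapping the import for cupy will probably work, but I have not tested that here.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
# Hypothetical second matrix, chosen only for illustration
B = np.array([[1, 0, 2],
              [0, 1, 0],
              [3, 0, 1]])

# Elementwise product with the transpose, summed per row
out = (A * B.T).sum(axis=1)

# Same contraction via einsum: out[i] = sum_j A[i, j] * B[j, i]
out_einsum = np.einsum('ij,ji->i', A, B)

# Reference: diagonal of the full matrix product (does extra work)
out_diag = np.diag(A @ B)

print(out)  # [10  5 23]
```

The np.diag(A @ B) form computes all n*n entries just to keep n of them, so it is mainly useful as a check on small inputs.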