I want to apply attention weights (5 labels) to my convolution output (3 filters). Could anyone help me figure out how to apply matmul here? A TensorFlow version would be appreciated as well.
import numpy as np
conv = np.random.randint(10,size=[1,3,2,2], dtype=int) # [batches,filter,row,col]
attention = np.random.randint(5,size=[1,5,2,1], dtype=int) # [batches,label,row,col]
np.matmul(conv,attention).shape # expected output size [1,3,5,2,1] [batches,filter,label,row,col]
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (1,3,2,2)->(1,3,2,newaxis,2) (1,5,2,1)->(1,5,newaxis,1,2)
According to the docs of matmul:
If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.
and
Stacks of matrices are broadcast together as if the matrices were elements.
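A quick illustration of that rule (shapes chosen arbitrarily): the leading "stack" dimensions broadcast like ordinary element-wise dimensions, while the last two dimensions follow matrix-multiplication rules.

```python
import numpy as np

# Stack dimensions broadcast like elements: (2, 1) vs (5,) -> (2, 5),
# while the trailing matrices multiply: (3, 4) @ (4, 2) -> (3, 2).
a = np.ones((2, 1, 3, 4))
b = np.ones((5, 4, 2))
print(np.matmul(a, b).shape)  # (2, 5, 3, 2)
```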
This means that in your case, all but the last two dimensions need to broadcast against each other, while the last two must be compatible for matrix multiplication. If you want the output shape to be (1, 3, 5, 2, 1), you will need to explicitly insert a singleton axis into each array. You can do that at creation time:
import numpy as np

conv = np.random.randint(10, size=[1, 3, 1, 2, 2], dtype=int)
attention = np.random.randint(5, size=[1, 1, 5, 2, 1], dtype=int)
np.matmul(conv, attention).shape  # (1, 3, 5, 2, 1)
Alternatively, you can insert the axes on the fly, multiplying views that carry the extra dimensions:
np.matmul(conv[:, :, np.newaxis, ...], attention[:, np.newaxis, ...]).shape
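As for the TensorFlow version you asked about: tf.matmul follows the same stack-of-matrices semantics, and in TF 2.x (and TF >= 1.14) it also broadcasts the leading batch dimensions, so the same axis insertion works. A sketch, assuming TF 2.x; the cast to float32 is there because tf.matmul does not accept the default NumPy integer dtype:

```python
import numpy as np
import tensorflow as tf

conv = np.random.randint(10, size=[1, 3, 2, 2]).astype(np.float32)
attention = np.random.randint(5, size=[1, 5, 2, 1]).astype(np.float32)

# Insert the singleton axes exactly as in the NumPy version;
# tf.matmul broadcasts the leading (batch) dimensions.
result = tf.matmul(conv[:, :, np.newaxis, ...], attention[:, np.newaxis, ...])
print(result.shape)  # (1, 3, 5, 2, 1)
```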