Tags: tensorflow, array-broadcasting, tensorflow-xla

Broadcasting between two same-rank tensors in TensorFlow


I have two tensors x and s with shapes:

> x.shape
TensorShape([Dimension(None), Dimension(3), Dimension(5), Dimension(5)])
> s.shape
TensorShape([Dimension(None), Dimension(12), Dimension(5), Dimension(5)])

I want to take the dot product of x with each consecutive group of 3 channels of s along dimension 1, so that the result has this shape:

> x_s.shape
TensorShape([Dimension(None), Dimension(4), Dimension(5), Dimension(5)])

where

x_s[i, 0, k, l] = sum([x[i, j, k, l] * s[i, j, k, l] for j in range(3)])
x_s[i, 1, k, l] = sum([x[i, j-3, k, l] * s[i, j, k, l] for j in range(3, 6)])
x_s[i, 2, k, l] = sum([x[i, j-6, k, l] * s[i, j, k, l] for j in range(6, 9)])
x_s[i, 3, k, l] = sum([x[i, j-9, k, l] * s[i, j, k, l] for j in range(9, 12)])

I have this implementation:

s_t = tf.transpose(s, [0, 2, 3, 1]) # [None, 5, 5, 12]
x_t = tf.transpose(x, [0, 2, 3, 1]) # [None, 5, 5, 3]
x_t = tf.tile(x_t, [1, 1, 1, 4]) # [None, 5, 5, 12]

x_s = x_t * s_t # [None, 5, 5, 12]
x_s = tf.reshape(x_s, [tf.shape(x_s)[0], 5, 5, 4, 3]) # [None, 5, 5, 4, 3]
x_s = tf.reduce_sum(x_s, axis=-1) # [None, 5, 5, 4]
x_s = tf.transpose(x_s, [0, 3, 1, 2]) # [None, 4, 5, 5]

I understand this is not memory-efficient because of the tile. I am also concerned that the reshape, transpose, element-wise multiply, and reduce_sum operations could hurt performance for larger tensors. Is there a cleaner alternative?


Solution

  • Do you have any evidence that reshapes are expensive? The following uses a single reshape plus broadcasting over a new axis, with no tile or transpose:

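    # s: [None, 12, 5, 5] -> [None, 4, 3, 5, 5]; x: [None, 3, 5, 5] -> [None, 1, 3, 5, 5]
    x_s = tf.reduce_sum(tf.reshape(s, (-1, 4, 3, 5, 5)) *
                        tf.expand_dims(x, axis=1), axis=2)  # [None, 4, 5, 5]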
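
To sanity-check that the reshape-and-broadcast expression matches the definition in the question, here is a minimal sketch. It uses NumPy as a stand-in for TensorFlow, on the assumption that the broadcasting rules behave the same way for these shapes. The function names, the batch size of 2, and the random inputs are made up for illustration. The einsum variant is an extra alternative that is not part of the answer above.

```python
import numpy as np

def grouped_dot_reference(x, s):
    """Literal translation of the x_s definition in the question (slow; for checking only)."""
    n, c, h, w = x.shape            # c == 3 in the question
    groups = s.shape[1] // c        # 12 // 3 == 4
    out = np.zeros((n, groups, h, w), dtype=x.dtype)
    for g in range(groups):
        for j in range(c):
            # x_s[i, g, k, l] += x[i, j, k, l] * s[i, 3*g + j, k, l]
            out[:, g] += x[:, j] * s[:, g * c + j]
    return out

def grouped_dot_broadcast(x, s):
    """Reshape s into groups and broadcast x over a new group axis."""
    n, c, h, w = x.shape
    s5 = s.reshape(n, -1, c, h, w)              # [N, 4, 3, 5, 5]
    return (s5 * x[:, np.newaxis]).sum(axis=2)  # [N, 4, 5, 5]

def grouped_dot_einsum(x, s):
    """Same contraction written as an einsum (hypothetical alternative)."""
    n, c, h, w = x.shape
    s5 = s.reshape(n, -1, c, h, w)
    return np.einsum('ngjkl,njkl->ngkl', s5, x)

# Illustrative inputs: batch size 2 stands in for the unknown (None) dimension.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 5, 5))
s = rng.standard_normal((2, 12, 5, 5))

ref = grouped_dot_reference(x, s)
assert np.allclose(ref, grouped_dot_broadcast(x, s))
assert np.allclose(ref, grouped_dot_einsum(x, s))
print(ref.shape)  # (2, 4, 5, 5)
```

The reshape only works this way because the 12 channels of s are laid out as 4 contiguous groups of 3. If the groups were interleaved instead, the reshape target would need to be (-1, 3, 4, 5, 5) with the axes adjusted accordingly.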