tensorflow-probability

Confusing output of tensorflow_probability.bijectors.ScaleMatvecDiag. Left or right multiplication?


The official documentation says "Compute Y = g(X; scale) = scale @ X", so I understood that scale is left-multiplied onto X. However, ScaleMatvecDiag appears to calculate X @ scale.

The following code produces the output shown below it:

import numpy as np
import tensorflow_probability as tfp
tfb = tfp.bijectors

x = [[1., 2.], [3., 4.]]
b = tfb.ScaleMatvecDiag(scale_diag=[-1., 2.])
b.forward(x)

[[-1.,  4.],
 [-3.,  8.]]

I am expecting

np.diag([-1., 2.]) @ x

[[-1., -2.],
 [ 6.,  8.]]

From the following outputs, I see that ScaleMatvecDiag calculates X @ scale.

y = [[1., 2, 3], [4, 5, 6]]
z = [[1., 2], [3, 4], [5, 6]]

b.forward(y) --> ValueError: Dimensions 2 and 3 are not compatible
b.forward(z) --> (3, 2)

I would appreciate it if anyone could clarify my misunderstanding.


Solution

  • I think there's a documentation bug.

    In short, matvec != matmul (and note that @ is matmul, not matvec).

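    Ignoring "batching", a matvec multiplies a matrix by a single vector and returns a vector, whereas a matmul multiplies a matrix by another matrix.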
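A minimal NumPy sketch of the unbatched case, using the scale from the question. NumPy stands in for the bijector here, so this only illustrates the matvec/matmul distinction and is not a description of TFP's internals; the variable names are made up for the example.

```python
import numpy as np

scale = np.diag([-1.0, 2.0])            # the 2x2 diagonal matrix from the question
v = np.array([1.0, 2.0])                # one 2-vector
x = np.array([[1.0, 2.0], [3.0, 4.0]])  # a 2x2 matrix

# matvec: matrix times a single vector gives a single vector.
matvec_result = scale @ v               # [-1., 4.]

# matmul: matrix times matrix; this is what np.diag(...) @ x computes.
matmul_result = scale @ x               # [[-1., -2.], [6., 8.]]

print(matvec_result)
print(matmul_result)
```

The matmul result is the one the question expected; the matvec result is what the bijector produces for the first row of x.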

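    Taking batching into account, the input is treated as a batch of vectors along its last axis, and the matvec is applied to each of those vectors separately.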
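The batched reading can be sketched in NumPy with einsum, assuming each row of x is one vector in the batch. Again this is an illustration rather than the bijector's actual implementation.

```python
import numpy as np

scale = np.diag([-1.0, 2.0])
x = np.array([[1.0, 2.0], [3.0, 4.0]])  # read as a batch of two 2-vectors (the rows)

# Apply scale @ row to every row b: out[b, i] = sum_j scale[i, j] * x[b, j]
batched = np.einsum("ij,bj->bi", scale, x)

# For a diagonal scale this is the same as broadcasting the diagonal over the rows...
elementwise = x * np.array([-1.0, 2.0])

# ...and the same as x @ scale.T, which is why it looks like right multiplication.
right_mult = x @ scale.T

print(batched)  # [[-1., 4.], [-3., 8.]]
```

This reproduces the output of b.forward(x) from the question.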

    The right-hand sides of your examples are being interpreted as batches of vectors, and only batches of 2-vectors are admissible to a matvec with a 2x2 left-hand side (matrix).
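To make the shape rule concrete, here is a small NumPy sketch applied to y and z from the question. The helper batched_matvec is hypothetical (it is not a TFP function), and its error message is invented for the example.

```python
import numpy as np

def batched_matvec(matrix, batch):
    """Hypothetical helper: apply `matrix` to each vector along the last axis of `batch`."""
    matrix = np.asarray(matrix, dtype=float)
    batch = np.asarray(batch, dtype=float)
    if batch.shape[-1] != matrix.shape[-1]:
        raise ValueError(
            f"vector length {batch.shape[-1]} does not match matrix width {matrix.shape[-1]}"
        )
    return np.einsum("ij,...j->...i", matrix, batch)

scale = np.diag([-1.0, 2.0])
y = [[1., 2, 3], [4, 5, 6]]    # batch of two 3-vectors: not admissible
z = [[1., 2], [3, 4], [5, 6]]  # batch of three 2-vectors: admissible

out_z = batched_matvec(scale, z)  # shape (3, 2)

try:
    batched_matvec(scale, y)
    y_rejected = False
except ValueError:
    y_rejected = True

print(out_z.shape, y_rejected)
```

Under this reading, z works because each of its rows is a 2-vector, while y fails because its rows have length 3.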