python tensorflow machine-learning tensor ragged-tensors

How do I subtract a tensor from a ragged tensor?

My question is very similar to "TensorFlow broadcasting of RaggedTensor".

I am doing machine learning, and each data sample consists of a list of lists of coordinates, where each inner list of coordinates represents a drawing stroke on a canvas. One such sample might be:

[
[[2,3], [2,4], [3,6], [4,8]], 
[[7,3], [10,9]], 
[[10,12], [14,17], [13,15]]
]

I would like to normalize these coordinates by subtracting the mean and dividing by the standard deviation. Specifically, I want the mean and standard deviation of all the x-coordinates (index=0) and of all the y-coordinates (index=1), computed separately. I got these values with

list_points=tf.ragged.constant(list_points)
STD=tf.math.reduce_std(list_points, axis=(0,1))
mean=tf.reduce_mean(list_points, axis=(0,1))

STD and mean both have shape (2,).

Now I want to subtract the mean from list_points (the sample list of lists of coordinates), but it seems that with ragged_rank=3 I can only subtract a scalar or a tensor that covers every single data point. Is there an easy way to subtract a Tensor of shape (2,) from the RaggedTensor?

I have tried to simply subtract mean from list_points directly, but whatever I do, I get this error:

ValueError: pylist has scalar values depth 3, but ragged_rank=3 requires scalar value depth greater than 3

Solution

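In your case, the ragged rank is actually 1: only the number of points per stroke varies, while every point has exactly two coordinates. With ragged_rank=1, tf.reduce_mean can be used as follows: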

list_points = tf.ragged.constant(
    [
        [[2,3], [2,4], [3,6], [4,8]], 
        [[7,3], [10,9]], 
        [[10,12], [14,17], [13,15]]
    ],
    ragged_rank=1,
    dtype=tf.float32
)
list_points.shape
# TensorShape([3, None, 2])
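
To see why passing ragged_rank=1 matters, it may help to compare against the default. The sketch below assumes the same sample data; the names points, default_rt, and explicit_rt are illustrative only. Without an explicit ragged_rank, tf.ragged.constant treats every dimension after the first as ragged, so the coordinate axis loses its fixed size of 2.

```python
import tensorflow as tf

points = [
    [[2, 3], [2, 4], [3, 6], [4, 8]],
    [[7, 3], [10, 9]],
    [[10, 12], [14, 17], [13, 15]],
]

# Default: every nested level is treated as ragged.
default_rt = tf.ragged.constant(points)
print(default_rt.ragged_rank, default_rt.shape)    # ragged_rank 2, shape (3, None, None)

# Explicit: only the "points per stroke" axis is ragged.
explicit_rt = tf.ragged.constant(points, ragged_rank=1)
print(explicit_rt.ragged_rank, explicit_rt.shape)  # ragged_rank 1, shape (3, None, 2)
```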

mean = tf.reduce_mean(list_points, axis=[0, 1])
# shape=(2,), dtype=float32, values ≈ [7.2222, 8.5556]

# E[x^2] - E[x]^2 is the variance; take the square root to get the standard deviation.
variance = tf.reduce_mean(list_points**2, axis=[0, 1]) - mean**2
# shape=(2,), values ≈ [19.7284, 23.8025]
std = tf.sqrt(variance)
# shape=(2,), values ≈ [4.4417, 4.8788]

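We can subtract an ordinary rank-1 tensor from a ragged tensor of ragged rank 1 (or add, multiply, etc.) as long as the tensor's length matches the ragged tensor's innermost, uniform dimension (here 2). The usual broadcasting rules then apply along the trailing axis.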

list_points - mean
# <tf.RaggedTensor [[[-5.2222223, -5.5555553],
#                    [-5.2222223, -4.5555553],
#                    [-4.2222223, -2.5555553],
#                    [-3.2222223, -0.55555534]],
#                   [[-0.22222233, -5.5555553],
#                    [2.7777777, 0.44444466]],
#                   [[2.7777777, 3.4444447],
#                    [6.7777777, 8.444445],
#                    [5.7777777, 6.4444447]]]>
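
Putting the pieces together, the full normalization asked about in the question might look like the sketch below. It assumes the same sample data, and it computes the standard deviation with tf.math.reduce_std on the dense .values tensor rather than on the ragged tensor itself, which is a choice made here for simplicity. The name normalized is illustrative.

```python
import tensorflow as tf

list_points = tf.ragged.constant(
    [
        [[2, 3], [2, 4], [3, 6], [4, 8]],
        [[7, 3], [10, 9]],
        [[10, 12], [14, 17], [13, 15]],
    ],
    ragged_rank=1,
    dtype=tf.float32,
)

# Per-coordinate statistics over all points of all strokes, each of shape (2,).
mean = tf.reduce_mean(list_points, axis=[0, 1])
std = tf.math.reduce_std(list_points.values, axis=0)

# Both operations broadcast the (2,) tensors along the last axis.
normalized = (list_points - mean) / std
print(normalized.shape)  # (3, None, 2)
```

After this step, each coordinate column of normalized should have mean close to 0 and standard deviation close to 1, while the stroke lengths stay unchanged.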

This is possible because a ragged tensor of ragged rank 1 stores its values in an ordinary dense tensor under the hood:

list_points.values.shape
# TensorShape([9, 2])
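
Since .values is dense, the same arithmetic can also be applied to it directly and the result wrapped back into the original row structure with with_values. This is a minimal sketch assuming the same sample data; the names flat and centered are illustrative.

```python
import tensorflow as tf

list_points = tf.ragged.constant(
    [
        [[2, 3], [2, 4], [3, 6], [4, 8]],
        [[7, 3], [10, 9]],
        [[10, 12], [14, 17], [13, 15]],
    ],
    ragged_rank=1,
    dtype=tf.float32,
)

flat = list_points.values            # dense tensor of shape (9, 2)
mean = tf.reduce_mean(flat, axis=0)  # shape (2,)

# Ordinary dense broadcasting, then restore the stroke boundaries.
centered = list_points.with_values(flat - mean)
print(centered.shape)  # (3, None, 2)
```

This should give the same result as list_points - mean, and it makes explicit that the row partition is untouched by the operation.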

For the ragged_rank > 1 case, we can use tf.math.segment_mean, although that approach is trickier.
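
As a rough illustration of the ragged_rank > 1 case, the sketch below assumes a hypothetical batch of samples (sample, then strokes, then points) with ragged_rank=2, and computes one mean per sample. The data and the names batch, point_sample_ids, per_sample_mean, and centered are made up for this example, and this is only one possible way to combine tf.math.segment_mean with the row partitions.

```python
import tensorflow as tf

# Hypothetical batch: 2 samples, each with a variable number of strokes,
# each stroke with a variable number of (x, y) points.
batch = tf.ragged.constant(
    [
        [[[0, 0], [2, 2]], [[4, 4]]],  # sample 0: two strokes, three points
        [[[1, 3], [3, 5]]],            # sample 1: one stroke, two points
    ],
    ragged_rank=2,
    dtype=tf.float32,
)

# Map every point to the sample it belongs to:
# stroke -> sample ids, then point -> stroke ids, composed with gather.
stroke_sample_ids = batch.value_rowids()
point_stroke_ids = batch.values.value_rowids()
point_sample_ids = tf.gather(stroke_sample_ids, point_stroke_ids)

# One (x, y) mean per sample; segment ids are already sorted.
per_sample_mean = tf.math.segment_mean(batch.flat_values, point_sample_ids)

# Subtract each point's own sample mean and restore the nested structure.
centered = batch.with_flat_values(
    batch.flat_values - tf.gather(per_sample_mean, point_sample_ids)
)
print(centered.shape)  # (2, None, None, 2)
```

If only a single global mean is needed, tf.reduce_mean(batch.flat_values, axis=0) on the dense flat_values tensor is likely simpler than the segment-based approach.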