
How do I subtract a tensor from a ragged tensor?


I am asking a question very similar to "TensorFlow broadcasting of RaggedTensor".

I am doing machine learning, and one data sample consists of a list of lists of coordinates, where each list of coordinates represents a drawing stroke on a canvas. One such sample might be:

[
[[2,3], [2,4], [3,6], [4,8]], 
[[7,3], [10,9]], 
[[10,12], [14,17], [13,15]]
]

I would like to normalize these coordinates by subtracting the mean and dividing by the standard deviation. Specifically, I want to find the mean and standard deviation of all the x-coordinates (index=0) and of all the y-coordinates (index=1), respectively. I got these values with:

list_points=tf.ragged.constant(list_points)
STD=tf.math.reduce_std(list_points, axis=(0,1))
mean=tf.reduce_mean(list_points, axis=(0,1))

STD and mean both have shape (2,).

Now I want to subtract the mean from list_points (the sample list of lists of coordinates), but it seems that for ragged_rank=3 I can only subtract a scalar or a tensor that covers every single data point. Is there an easy way to subtract a Tensor of shape (2,) from the RaggedTensor?

I have tried to simply subtract mean from list_points directly, but whatever I do, I get this error:

ValueError: pylist has scalar values depth 3, but ragged_rank=3 requires scalar value depth greater than 3


Solution

  • In your case, ragged_rank is actually 1: only the number of points per stroke varies, while every point has exactly two coordinates. Thus tf.reduce_mean can be used as follows:

    import tensorflow as tf

    list_points = tf.ragged.constant(
        [
            [[2,3], [2,4], [3,6], [4,8]], 
            [[7,3], [10,9]], 
            [[10,12], [14,17], [13,15]]
        ],
        ragged_rank=1,
        dtype=tf.float32
    )
    list_points.shape
    # TensorShape([3, None, 2])
    
    mean = tf.reduce_mean(list_points, axis=[0, 1])
    # shape=(2,), dtype=float32, approximately [7.2222, 8.5556]
    
    # E[x^2] - E[x]^2 is the variance; its square root is the standard deviation
    variance = tf.reduce_mean(list_points**2, axis=[0, 1]) - mean**2
    # shape=(2,), dtype=float32, approximately [19.7284, 23.8025]
    std = tf.sqrt(variance)
    # shape=(2,), dtype=float32, approximately [4.4417, 4.8788]
    

    An ordinary rank-1 tensor can be subtracted from (added to, multiplied with, etc.) a ragged tensor of ragged rank 1, provided its length matches the ragged tensor's innermost (uniform) dimension, which is 2 here.

    list_points - mean
    # <tf.RaggedTensor [[[-5.2222223, -5.5555553],
    #   [-5.2222223, -4.5555553],
    #   [-4.2222223, -2.5555553],
    #   [-3.2222223, -0.55555534]],
    #  [[-0.22222233, -5.5555553],
    #   [2.7777777, 0.44444466]],
    #  [[2.7777777, 3.4444447],
    #   [6.7777777, 8.444445],
    #   [5.7777777, 6.4444447]]]>
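
The question also asks to divide by the standard deviation. A minimal sketch of the full normalization follows, assuming the broadcasting that makes list_points - mean work applies to division in the same way. The names variance and normalized are introduced here for illustration only.

```python
import tensorflow as tf

list_points = tf.ragged.constant(
    [
        [[2, 3], [2, 4], [3, 6], [4, 8]],
        [[7, 3], [10, 9]],
        [[10, 12], [14, 17], [13, 15]],
    ],
    ragged_rank=1,
    dtype=tf.float32,
)

# Per-coordinate statistics over all strokes and all points
mean = tf.reduce_mean(list_points, axis=[0, 1])
variance = tf.reduce_mean(list_points**2, axis=[0, 1]) - mean**2
std = tf.sqrt(variance)

# Both (2,)-shaped tensors broadcast against the innermost dimension
normalized = (list_points - mean) / std
```

After this, the x and y columns of normalized should each have a mean close to 0 and a standard deviation close to 1, up to float32 rounding.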
    

    This is possible because a ragged tensor of ragged rank 1 stores its values under the hood in an ordinary dense tensor:

    list_points.values.shape
    # TensorShape([9, 2])
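
Because the values are dense, the same normalization can probably also be sketched directly on that [9, 2] tensor and then wrapped back into the original row structure. This assumes ragged_rank=1; with_values is a RaggedTensor method, while the name normalized is introduced here for illustration.

```python
import tensorflow as tf

list_points = tf.ragged.constant(
    [
        [[2, 3], [2, 4], [3, 6], [4, 8]],
        [[7, 3], [10, 9]],
        [[10, 12], [14, 17], [13, 15]],
    ],
    ragged_rank=1,
    dtype=tf.float32,
)

values = list_points.values                  # dense tensor of shape [9, 2]
mean = tf.reduce_mean(values, axis=0)        # shape (2,)
std = tf.math.reduce_std(values, axis=0)     # population std, shape (2,)

# Normalize the dense values, then reattach the original row splits
normalized = list_points.with_values((values - mean) / std)
```

Working on the dense values sidesteps ragged broadcasting entirely, at the cost of only being this simple when the statistics are taken over all points at once.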
    

    For the ragged_rank > 1 case we can resort to tf.math.segment_mean, which is trickier.
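
One way this could look for ragged_rank=2 (a batch of samples, each a list of strokes) is sketched below, subtracting a separate mean per sample. This is an illustration under that assumption rather than code from the original answer; the toy data and the names batch, stroke_to_sample, point_to_stroke, point_to_sample, per_sample_mean, and centered are made up for the example.

```python
import tensorflow as tf

# Hypothetical toy batch: samples -> strokes -> points -> (x, y)
batch = tf.ragged.constant(
    [
        [[[0.0, 0.0], [2.0, 2.0]], [[4.0, 4.0]]],
        [[[1.0, 3.0], [3.0, 5.0]]],
    ],
    ragged_rank=2,
)

# Sample index of every stroke, and stroke index of every point
stroke_to_sample = batch.value_rowids()
point_to_stroke = batch.values.value_rowids()
# Chain the two lookups to get the sample index of every point
point_to_sample = tf.gather(stroke_to_sample, point_to_stroke)

flat = batch.flat_values  # dense tensor, one row per point
# Segment ids are sorted here, as tf.math.segment_mean requires
per_sample_mean = tf.math.segment_mean(flat, point_to_sample)

# Subtract each point's own sample mean and restore the nesting
centered = batch.with_flat_values(
    flat - tf.gather(per_sample_mean, point_to_sample)
)
```

If a single mean over the whole batch is enough, reducing batch.flat_values along axis 0 and using with_flat_values is simpler and needs no segment ids.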