numpy, tensorflow, keras, optimization, tensor

Optimize loss over pixel values across two 4D-tensors


I've been trying to implement a loss function (with TensorFlow/Keras) for predicting an orientation map, based on a particular paper I found useful. The authors do this by predicting a sine and a cosine value for each pixel (one on each channel of the output) and then obtaining a distance measure with the following function:

θ^(1+δ) = (arccos(cosα · cosβ + sinα · sinβ))^(1+δ)

For the sake of completeness, its gradient is (1 + δ) · θ^δ, where δ = 0.2.

Given that alpha is y_true and beta is y_pred, and that those tensors have shape (batch, height, width, channels), I devised an implementation that uses nested for loops. It may work, but it's unoptimized, and I'm not sure whether Keras will be able to backpropagate through it, as I have little experience with ML.
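For concreteness, a tensor with that layout can be built from an angle map roughly like this (toy values, just to illustrate the channel encoding):

import numpy as np

# A (batch=1, height=2, width=2) map of angles in radians
angles = np.array([[[0.0, np.pi / 4],
                    [np.pi / 2, np.pi]]])

# Stack cosines and sines along the channel axis -> shape (1, 2, 2, 2)
encoded = np.stack([np.cos(angles), np.sin(angles)], axis=-1)
print(encoded.shape)  # (1, 2, 2, 2)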

I'd like to know if there is a better-optimized way of implementing this than the current code (below), as I couldn't find one anywhere; this is my first question here. The min and max functions are used to clip the values to the interval [10^-6, 1 - 10^-6].

import math

def angle_distance_loss(y_true, y_pred):
    """
    Lproposed = (arccos(cosα · cosβ + sinα · sinβ))^(1+δ)
    """
    batch, height, width, channels = y_true.shape
    cos_c = 0  # channel holding the cosine values
    sin_c = 1  # channel holding the sine values
    l = 0.0
    for batch_i in range(batch):
        for h_j in range(height):
            for w_k in range(width):
                yt_cos = y_true[batch_i][h_j][w_k][cos_c]
                yt_sin = y_true[batch_i][h_j][w_k][sin_c]

                yp_cos = y_pred[batch_i][h_j][w_k][cos_c]
                yp_sin = y_pred[batch_i][h_j][w_k][sin_c]

                # Clip the dot product to [1e-6, 1 - 1e-6] before arccos
                l += math.acos(max(10**-6, min(yt_cos * yp_cos + yt_sin * yp_sin, 1 - 10**-6))) ** 1.2  # exponent is 1 + δ with δ = 0.2

    return l / (batch * height * width)  # mean over all pixels

Any input on this is welcome :)


Solution

  • This looks like a good start. But as you mentioned, nested Python loops (together with math.acos, min, and max) are not efficient here, and they don't give TensorFlow tensor operations it can optimize. Instead, you should use TensorFlow's built-in vectorized operations.

    These operations are fully differentiable, so you should have no issues with backpropagation (see the gradient check after the code).

    Here is a more optimized implementation:

    import tensorflow as tf

    def angle_distance_loss(
        y_true: tf.Tensor,
        y_pred: tf.Tensor,
        delta: float = 0.2,
        epsilon: float = 1e-6,
    ) -> tf.Tensor:
        """
        Lproposed = (arccos(cosα · cosβ + sinα · sinβ))^(1+δ)
        """
        dot_product = tf.clip_by_value(
            tf.reduce_sum(y_true * y_pred, axis=-1), 
            clip_value_min=epsilon, 
            clip_value_max=1-epsilon
        )  # Compute the dot product of the cosine and sine components
        angle_distance = tf.acos(dot_product) ** (1 + delta)  # Compute the angle distance
        return tf.reduce_mean(angle_distance)  # Return the mean loss over the batch
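
    To see that gradients really flow through this, here is a quick gradient check (a minimal sketch; the shapes and angle values are made up just for illustration):

    import numpy as np
    import tensorflow as tf

    # Build small random orientation maps: channel 0 = cos(angle), channel 1 = sin(angle)
    rng = np.random.default_rng(42)
    angles_true = rng.uniform(0.0, 2.0 * np.pi, size=(2, 4, 4))
    angles_pred = rng.uniform(0.0, 2.0 * np.pi, size=(2, 4, 4))

    y_true = tf.constant(np.stack([np.cos(angles_true), np.sin(angles_true)], axis=-1), dtype=tf.float32)
    y_pred = tf.Variable(np.stack([np.cos(angles_pred), np.sin(angles_pred)], axis=-1), dtype=tf.float32)

    with tf.GradientTape() as tape:
        loss = angle_distance_loss(y_true, y_pred)

    grads = tape.gradient(loss, y_pred)
    print(loss.numpy(), grads.shape)  # a finite scalar loss and a (2, 4, 4, 2) gradient tensor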
    

    I did no heavy testing on this, so before relying on it, you should verify it yourself, e.g. by evaluating both implementations on the same inputs and checking that the results match (via debugging or printing).
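
    A minimal cross-check might look like this (assuming the original loop version has been renamed to angle_distance_loss_loops, a hypothetical name, so that both functions can coexist):

    import numpy as np
    import tensorflow as tf

    # Single pixel with true angle 0.3 and predicted angle 1.1 -> shape (1, 1, 1, 2)
    y_true = tf.constant([[[[np.cos(0.3), np.sin(0.3)]]]], dtype=tf.float32)
    y_pred = tf.constant([[[[np.cos(1.1), np.sin(1.1)]]]], dtype=tf.float32)

    print(angle_distance_loss(y_true, y_pred).numpy())   # vectorized version
    print(angle_distance_loss_loops(y_true, y_pred))     # loop-based original, run eagerly
    # Both should be close to acos(cos(1.1 - 0.3)) ** 1.2 = 0.8 ** 1.2 ≈ 0.765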