I've been trying to implement a loss function (with tensorflow/keras) for predicting an orientation map based on a particular paper I found useful. The authors do this by predicting a sine and cosine value for each pixel (on each channel of the output) and then obtaining a distance measure with the following function
θ^(1+δ) = (arccos(cosα · cosβ + sinα · sinβ))^(1+δ)
For the sake of completeness, its gradient is (1 + δ) · θ^δ, where δ = 0.2.
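(In case it's useful: since cosα · cosβ + sinα · sinβ = cos(α − β), θ is just the absolute angular difference between the two orientations, which is how I understand the formula. A quick check with made-up angles:)

```python
import math

alpha, beta, delta = 1.0, 0.3, 0.2
dot = math.cos(alpha) * math.cos(beta) + math.sin(alpha) * math.sin(beta)
theta = math.acos(dot)       # equals |alpha - beta|, i.e. theta ≈ 0.7 here
loss = theta ** (1 + delta)  # the per-pixel loss value
```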
Given that alpha is y_true and beta is y_pred, and that those tensors have shape (batch, height, width, channels), I devised an implementation that uses nested for loops. It may work, but it's unoptimized, and I'm not sure whether Keras will be able to backpropagate through it, as I have little experience with ML.
I'd like to know if there is a better-optimized way of implementing this than the current code (below), as I couldn't find one anywhere; this is my first question here. The min and max functions are used to clip the values to the interval [10^-6, 1-10^-6].
import math

def angle_distance_loss(y_true, y_pred):
    """
    Lproposed = (arccos(cosα · cosβ + sinα · sinβ))^(1+δ)
    """
    batch, height, width, channels = y_true.shape
    cos_c = 0
    sin_c = 1
    l = 0
    for batch_i in range(batch):
        for h_j in range(height):
            for w_k in range(width):
                yt_cos = y_true[batch_i][h_j][w_k][cos_c]
                yt_sin = y_true[batch_i][h_j][w_k][sin_c]
                yp_cos = y_pred[batch_i][h_j][w_k][cos_c]
                yp_sin = y_pred[batch_i][h_j][w_k][sin_c]
                l += math.acos(max(10**-6, min(yt_cos * yp_cos + yt_sin * yp_sin, 1 - 10**-6))) ** 1.2
    return l / (batch * height * width)
Any input on this is welcome :)
This looks like a good start. But as you mentioned, the nested loops and the functions you're using are not efficient. Instead, you should use TensorFlow's built-in vectorized operations:
- tf.clip_by_value instead of min/max
- tf.acos instead of math.acos
- tf.reduce_sum and tf.reduce_mean instead of summing up and dividing by the size

These operations are fully differentiable, so you should have no issues with backpropagation.
Here is a more optimized implementation:
def angle_distance_loss(y_true: tf.Tensor, y_pred: tf.Tensor, delta: float = 0.2, epsilon: float = 1e-6) -> tf.Tensor:
    """
    Lproposed = (arccos(cosα · cosβ + sinα · sinβ))^(1+δ)
    """
    # Dot product of the [cos, sin] channels, clipped for numerical stability
    dot_product = tf.clip_by_value(
        tf.reduce_sum(y_true * y_pred, axis=-1),
        clip_value_min=epsilon,
        clip_value_max=1 - epsilon,
    )
    angle_distance = tf.acos(dot_product) ** (1 + delta)  # per-pixel angle distance
    return tf.reduce_mean(angle_distance)  # mean loss over all pixels in the batch
I haven't tested this heavily, so before relying on it, you should verify it, e.g. by running both implementations on the same inputs and checking that the results match (via debugging or printing).
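As a rough sketch of that cross-check, here is how I would compare the two on random data using NumPy stand-ins for the TensorFlow ops (np.clip for tf.clip_by_value, np.arccos for tf.acos, sum/mean for tf.reduce_sum/tf.reduce_mean); the shapes and variable names are just my assumptions:

```python
import math
import numpy as np

rng = np.random.default_rng(0)
delta, eps = 0.2, 1e-6
B, H, W = 2, 4, 4

# Random orientations -> (batch, height, width, 2) tensors of [cos, sin]
true_ang = rng.uniform(0, 2 * np.pi, size=(B, H, W))
pred_ang = rng.uniform(0, 2 * np.pi, size=(B, H, W))
y_true = np.stack([np.cos(true_ang), np.sin(true_ang)], axis=-1)
y_pred = np.stack([np.cos(pred_ang), np.sin(pred_ang)], axis=-1)

# Loop version, mirroring the original question code
total = 0.0
for b in range(B):
    for h in range(H):
        for w in range(W):
            dot = (y_true[b, h, w, 0] * y_pred[b, h, w, 0]
                   + y_true[b, h, w, 1] * y_pred[b, h, w, 1])
            total += math.acos(max(eps, min(dot, 1 - eps))) ** (1 + delta)
loop_loss = total / (B * H * W)

# Vectorized version, mirroring the TensorFlow ops
dot = np.clip((y_true * y_pred).sum(axis=-1), eps, 1 - eps)
vec_loss = (np.arccos(dot) ** (1 + delta)).mean()

print(abs(loop_loss - vec_loss))  # should be ~0
```

If the two values agree on a few random inputs like this, the vectorized TensorFlow version should be a faithful drop-in replacement.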