I was following the answer to the question below in order to build a custom loss function in Keras that considers only the top 20 predictions.
How can I sort the values in a custom Keras / Tensorflow Loss Function?
However, when I try to compile my model using this code I get the following error about dimensions:
InvalidArgumentError: input must have last dimension >= k = 20 but is 1 for 'loss_21/dense_65_loss/TopKV2' (op: 'TopKV2') with input shapes: [?,1], [] and with computed input tensors: input[1] = <20>.
A simplified version of the code that reproduces the error is the following:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
top = 20
def top_loss(y_true, y_pred):
    y_pred_top_k, y_pred_ind_k = tf.nn.top_k(y_pred, top)
    loss_per_sample = tf.reduce_mean(tf.reduce_sum(y_pred_top_k,
                                                   axis=-1))
    return loss_per_sample
model = Sequential()
model.add(Dense(50, input_dim=201))
model.add(Dense(1))
sgd = SGD(lr=0.01, decay=0, momentum=0.9)
model.compile(loss=top_loss, optimizer=sgd)
The error is thrown at the following line of the top_loss function when the model is compiled:
y_pred_top_k, y_pred_ind_k = tf.nn.top_k(y_pred, top)
It seems that y_pred has shape [?,1] by default at compile time, while tf.nn.top_k expects the last dimension to be at least k (i.e. 20).
Do I have to cast y_pred to something so that tf.nn.top_k knows it has the correct dimensions?
Use:
y_pred_top_k, y_pred_ind_k = tf.nn.top_k(y_pred[:,0], top)
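tf.nn.top_k searches along the last dimension, which has size 1 for the output of Dense(1). y_pred[:,0] gets the predicted values of the full batch as a rank-1 tensor, so the top k values are taken across the batch instead.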
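To see why the slice matters, here is a small sketch. It assumes TensorFlow 2.x with eager execution (the question's code targets the older 1.x API), and the y_pred tensor is a made-up stand-in for a batch of 32 Dense(1) outputs, not output from the model above.

```python
import tensorflow as tf

# Hypothetical stand-in for a batch of 32 predictions from Dense(1): shape (32, 1).
y_pred = tf.reshape(tf.range(32, dtype=tf.float32), (32, 1))

# top_k searches along the LAST axis, which has size 1 here, so k=20 is rejected.
try:
    tf.math.top_k(y_pred, k=20)
    failed = False
except (tf.errors.InvalidArgumentError, ValueError):
    failed = True

# Dropping the trailing axis gives shape (32,), so the search runs over the batch.
flat = y_pred[:, 0]
values, indices = tf.math.top_k(flat, k=20)

print("top_k on shape (32, 1) failed:", failed)
print("shape after slicing:", flat.shape)
print("largest and smallest of the top 20:", float(values[0]), float(values[-1]))
```

The values come back sorted in descending order, and indices refers to positions within the batch.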
Another Problem:
However, you will still run into a problem with the last batch. Say your batch size is 32 and your training data has 100 samples: the last batch will then contain only 4 samples, fewer than 20, so tf.nn.top_k will raise a runtime error for that batch. Making sure the last batch has at least 20 samples avoids the issue, but a much better way is to check whether the current batch has fewer than 20 samples and, if so, lower the k value passed to tf.nn.top_k accordingly.
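The batch-size arithmetic can be checked with a few lines of plain Python. The helper batch_sizes is hypothetical, written only for this illustration, and it assumes the data is split into consecutive batches with the remainder kept as a final, smaller batch.

```python
def batch_sizes(n_samples, batch_size):
    """Return the size of each batch when the remainder is kept as a last batch."""
    full, remainder = divmod(n_samples, batch_size)
    sizes = [batch_size] * full
    if remainder:
        sizes.append(remainder)
    return sizes

top = 20
sizes = batch_sizes(100, 32)
# Batches too small for top_k with k = top.
too_small = [s for s in sizes if s < top]

print(sizes)
print(too_small)
```

With 100 samples and a batch size of 32, only the last batch falls below k = 20.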
Code
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
top = tf.constant(20)
def top_loss(y_true, y_pred):
    # Use the batch size as k when the batch has fewer than `top` samples.
    batch_size = tf.shape(y_true)[0]
    result = tf.cond(tf.math.greater(top, batch_size),
                     lambda: batch_size, lambda: top)
    y_pred_top_k, y_pred_ind_k = tf.nn.top_k(y_pred[:,0], result)
    loss_per_sample = tf.reduce_mean(tf.reduce_sum(y_pred_top_k,
                                                   axis=-1))
    return loss_per_sample
model = Sequential()
model.add(Dense(50, input_dim=201))
model.add(Dense(1))
sgd = SGD(lr=0.01, decay=0, momentum=0.9)
model.compile(loss=top_loss, optimizer=sgd)
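As a possible alternative to tf.cond, the same clamping can be written with tf.minimum. The sketch below is not the code from the answer: it assumes TensorFlow 2.x with eager execution, the name top_loss_clamped is made up for illustration, and the two input tensors are invented test batches.

```python
import tensorflow as tf

TOP = 20  # same k as in the question

def top_loss_clamped(y_true, y_pred):
    # Never ask top_k for more values than the batch contains.
    batch_size = tf.shape(y_pred)[0]
    k = tf.minimum(TOP, batch_size)
    top_values, _ = tf.math.top_k(y_pred[:, 0], k=k)
    # Mirrors the reduction used above; for a rank-1 input the mean is a no-op.
    return tf.reduce_mean(tf.reduce_sum(top_values, axis=-1))

# Hypothetical batches: one smaller than TOP, one larger.
small = tf.constant([[1.0], [2.0], [3.0], [4.0]])
large = tf.reshape(tf.range(32, dtype=tf.float32), (32, 1))

small_loss = top_loss_clamped(None, small)  # sums all 4 values
large_loss = top_loss_clamped(None, large)  # sums the 20 largest values, 12 through 31

print(float(small_loss), float(large_loss))
```

Because k is computed from tf.shape at run time, the same function handles both full batches and a short final batch.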