I have trained a Siamese neural network that uses triplet loss. It was a pain, but I think I managed to do it. However, I am struggling to understand how to make evaluations with this model.
The SNN:
def triplet_loss(y_true, y_pred):
    # y_pred holds the stacked distances [pos_dist, neg_dist, ter_dist]
    margin = K.constant(1)
    return K.mean(K.maximum(K.constant(0),
                            K.square(y_pred[:, 0])
                            - 0.5 * (K.square(y_pred[:, 1]) + K.square(y_pred[:, 2]))
                            + margin))
def euclidean_distance(vects):
    # Row-wise Euclidean distance between two batches of embeddings
    x, y = vects
    return K.sqrt(K.maximum(K.sum(K.square(x - y), axis=1, keepdims=True), K.epsilon()))
anchor_input = Input((max_len, ), name='anchor_input')
positive_input = Input((max_len, ), name='positive_input')
negative_input = Input((max_len, ), name='negative_input')
Shared_DNN = create_base_network(embedding_dim=EMBEDDING_DIM, max_len=MAX_LEN, embed_matrix=embed_matrix)
encoded_anchor = Shared_DNN(anchor_input)
encoded_positive = Shared_DNN(positive_input)
encoded_negative = Shared_DNN(negative_input)
positive_dist = Lambda(euclidean_distance, name='pos_dist')([encoded_anchor, encoded_positive])
negative_dist = Lambda(euclidean_distance, name='neg_dist')([encoded_anchor, encoded_negative])
tertiary_dist = Lambda(euclidean_distance, name='ter_dist')([encoded_positive, encoded_negative])
stacked_dists = Lambda(lambda vects: K.stack(vects, axis=1), name='stacked_dists')([positive_dist, negative_dist, tertiary_dist])
model = Model([anchor_input, positive_input, negative_input], stacked_dists, name='triple_siamese')
model.compile(loss=triplet_loss, optimizer=adam_optim, metrics=[accuracy])
history = model.fit([Anchor, Positive, Negative], y=Y_dummy,
                    validation_data=([Anchor_test, Positive_test, Negative_test], Y_dummy2),
                    batch_size=128, epochs=25)
I understand that once a model is trained with triplets, evaluation shouldn't actually require triplets. However, how do I finagle this reshaping?
Because this is an SNN, I would want to feed two inputs into model.evaluate, along with a binary label denoting whether the two inputs are similar (1 = similar, 0 = not similar).
So basically, I want model.evaluate(input1, input2, y_label). But I am not sure how to get this with the model that I trained. As shown above, I trained with three inputs: model.fit([Anchor, Positive, Negative], y=Y_dummy ... ).
I know I should save the weights of my trained model, but I just don't know what model to load the weights onto.
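To make it concrete, here is the sort of two-input model I imagine evaluating with, reusing Shared_DNN and euclidean_distance from my training code (input_a, input_b, and pair_model are placeholder names I made up, and I'm not sure this construction is even right):

input_a = Input((max_len, ), name='input_a')
input_b = Input((max_len, ), name='input_b')
# Reuse the shared encoder so both inputs land in the same embedding space
dist = Lambda(euclidean_distance, name='dist')([Shared_DNN(input_a), Shared_DNN(input_b)])
pair_model = Model([input_a, input_b], dist)

Even with something like this, I don't know how to compile it so that model.evaluate returns a meaningful loss/accuracy against binary labels.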
Your help is greatly appreciated!
EDIT:
I am aware of the approach below for prediction, but I am not looking for prediction; I am looking to use model.evaluate, as I want some final measure of loss/accuracy for the model. Also, this approach only feeds the anchor into the model (whereas I'm interested in text similarity, so I would want to feed in two inputs):
eval_model = Model(inputs=anchor_input, outputs=encoded_anchor)
eval_model.load_weights('weights.hdf5')
Considering that eval_model is trained to produce embeddings, it should be possible to evaluate the similarity between two embeddings using cosine similarity.
Following the TF documentation, tf.keras.losses.cosine_similarity returns a number between -1 and 1: a negative value closer to -1 indicates greater similarity, while a positive value closer to 1 indicates greater dissimilarity.
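A quick sanity check of that sign convention (this snippet is mine, not from the TF docs):

import tensorflow as tf

# Identical vectors -> -1 (maximally similar); opposite vectors -> +1
print(tf.keras.losses.cosine_similarity([[1., 0.]], [[1., 0.]]).numpy())   # [-1.]
print(tf.keras.losses.cosine_similarity([[1., 0.]], [[-1., 0.]]).numpy())  # [1.]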
We can simply calculate the cosine similarity between the two inputs of every labeled pair at our disposal (X1 and X2 below). When the cosine similarity is < 0, we say the two inputs are similar (1 = similar, 0 = not similar). In the end, it is possible to calculate the binary accuracy as a final metric. We can make all the calculations in TF, without needing model.evaluate:
eval_model = Model(inputs=anchor_input, outputs=encoded_anchor)
eval_model.load_weights('weights.hdf5')

# cosine_similarity returns values in [-1, 1]; negative means similar
cos_sim = tf.keras.losses.cosine_similarity(
    eval_model(X1), eval_model(X2)
).numpy().reshape(-1, 1)

# Flip the sign so similar pairs score > 0, then threshold at 0
accuracy = tf.reduce_mean(tf.keras.metrics.binary_accuracy(Y, -cos_sim, threshold=0))
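That said, since the question specifically asks for model.evaluate, one could also wrap the same computation in a compiled two-input model. This is only a sketch under my assumptions (pair_in_a, pair_in_b, eval_pair_model, and the (1 - cos_sim) / 2 rescaling are mine, not from the original code):

import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda
from tensorflow.keras.models import Model

pair_in_a = Input((max_len, ), name='pair_in_a')
pair_in_b = Input((max_len, ), name='pair_in_b')

# Map the loss-style output (-1 = similar, +1 = dissimilar) into [0, 1],
# where 1 means similar, so binary_crossentropy/binary_accuracy apply directly
similarity = Lambda(
    lambda t: (1.0 - tf.keras.losses.cosine_similarity(t[0], t[1])) / 2.0
)([eval_model(pair_in_a), eval_model(pair_in_b)])

eval_pair_model = Model([pair_in_a, pair_in_b], similarity)
eval_pair_model.compile(loss='binary_crossentropy', metrics=['binary_accuracy'])

# The default 0.5 threshold on the rescaled score matches cos_sim > 0 above
loss, acc = eval_pair_model.evaluate([X1, X2], Y)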
Another approach is to compute the cosine similarity between the anchor and positive inputs and compare it with the similarity between the anchor and negative inputs.
eval_model = Model(inputs=anchor_input, outputs=encoded_anchor)
eval_model.load_weights('weights.hdf5')

# Remember: these are loss-style values, so more negative means more similar
positive_similarity = tf.keras.losses.cosine_similarity(
    eval_model(X_anchor), eval_model(X_positive)
).numpy().mean()

negative_similarity = tf.keras.losses.cosine_similarity(
    eval_model(X_anchor), eval_model(X_negative)
).numpy().mean()
We should expect the similarity between the anchor and positive inputs to be larger than the similarity between the anchor and negative inputs. Note that because tf.keras.losses.cosine_similarity returns the negated cosine, this means positive_similarity should come out more negative (smaller) than negative_similarity.
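As a sketch of turning that expectation into a single number (my addition; triplet_accuracy is a name I made up), one can count the fraction of triplets where the anchor-positive pair is more similar than the anchor-negative pair:

import tensorflow as tf

# Negate the loss-style values so larger means more similar
pos_sim = -tf.keras.losses.cosine_similarity(eval_model(X_anchor), eval_model(X_positive))
neg_sim = -tf.keras.losses.cosine_similarity(eval_model(X_anchor), eval_model(X_negative))

# Fraction of triplets ranked correctly by the learned embeddings
triplet_accuracy = tf.reduce_mean(tf.cast(pos_sim > neg_sim, tf.float32)).numpy()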