python, tensorflow, deep-learning, tf.keras, dot-product

How to get the dot product of two embedding layers in tensorflow.keras using the Sequential class and set weights for the embedding layers?


I'm trying to build a model with tensorflow.keras in which I take the dot product of two embedding layers with predefined weights (which I'll optimize when compiling the model). My goal at inference time is to use embedding_layer_1 as a lookup table, retrieving its weights at a specified index, which is why I keep trainable=True.

The shape of the weights_matrix is (288, 3569), and I'd like to take the dot product of embedding_layer_1 (shape (288, 3569)) and embedding_layer_2 (shape (3569, 288)); in other words, embedding_layer_2 is the transpose of embedding_layer_1.

This is the network:

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, Dot, Flatten

input_dim=288
output_dim=3569

model_1 = Sequential()
embedding_layer_1 = Embedding(input_dim=input_dim, output_dim=output_dim, name='embedding_layer_1', dtype='float64', trainable=True,  input_length=1)
embedding_layer_1.build((None,))
model_1.add(embedding_layer_1)
model_1.layers[0].set_weights([weights_matrix])
model_1.add(Flatten())

model_2 = Sequential()
embedding_layer_2 = Embedding(input_dim=input_dim, output_dim=output_dim, name='embedding_layer_2', dtype='float64', trainable=True,  input_length=1)
embedding_layer_2.build((None,))
model_2.add(embedding_layer_2)
model_2.layers[0].set_weights([role_skill_matrix])  # another predefined matrix of shape (288, 3569)
model_2.add(Flatten())

dot_product = Dot(axes=-1)([model_1.output, model_2.output])

model = Sequential([model_1, model_2, dot_product])
model.summary()

I'm using tensorflow==2.8.0 and keras==2.8.0.

I tried various changes to the network, such as transposing embedding_layer_1 and taking the dot product of embedding_layer_1 and its transpose, as well as other variations, yet nothing worked. I get the following error:

    model = Sequential([model_1, model_2, dot_product])
  File "/Users/ayalaallon/opt/anaconda3/envs/ml-pipeline/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 629, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/Users/ayalaallon/opt/anaconda3/envs/ml-pipeline/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/ayalaallon/opt/anaconda3/envs/ml-pipeline/lib/python3.8/site-packages/keras/engine/sequential.py", line 178, in add
    raise TypeError('The added layer must be an instance of class Layer. '
TypeError: The added layer must be an instance of class Layer. Received: layer=KerasTensor(type_spec=TensorSpec(shape=(None, 1), dtype=tf.float32, name=None), name='dot/Squeeze:0', description="created by layer 'dot'") of type <class 'keras.engine.keras_tensor.KerasTensor'>.

Process finished with exit code 1

Solution

  • I am not sure you can use layers like the Dot layer in a Sequential model. The tricky part is that the layers in a Sequential model are stacked... well, sequentially, so there is no way for a layer to receive more than one input in this kind of model.
    Fortunately, you can just use the functional API to construct the model, and you already started using functional syntax yourself when you constructed the Dot layer.

    import tensorflow as tf
    from tensorflow.keras import Model
    from tensorflow.keras.layers import Embedding, Dot, Flatten
    
    input_dim=288
    output_dim=3569
    
    in1 = tf.keras.layers.Input(input_dim)
    embedding_layer_1 = Embedding(input_dim=input_dim, output_dim=output_dim, name='embedding_layer_1', 
                                  dtype='float64', trainable=True,  input_length=1)(in1)
    flat1 = Flatten()(embedding_layer_1)
    
    in2 = tf.keras.layers.Input(input_dim)
    embedding_layer_2 = Embedding(input_dim=input_dim, output_dim=output_dim, name='embedding_layer_2',
                                  dtype='float64', trainable=True,  input_length=1)(in2)
    flat2 = Flatten()(embedding_layer_2)
    
    dot_product = Dot(axes=-1)([flat1, flat2])
    
    model = Model(inputs=[in1, in2], outputs=[dot_product])
    model.summary()
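
    The question also sets predefined weights on the embedding layers. You can do the same here once the model is built; a minimal sketch, assuming weights_matrix is a NumPy array of shape (288, 3569) as in the question (and, just for illustration, reusing it for both layers):

    import numpy as np
    
    # Stand-in for the question's predefined weights; the shape must match
    # the Embedding weights, i.e. (input_dim, output_dim) = (288, 3569).
    weights_matrix = np.random.rand(input_dim, output_dim)
    
    # Look the layers up by name and overwrite their single weight matrix
    # (the embeddings) in place.
    model.get_layer('embedding_layer_1').set_weights([weights_matrix])
    model.get_layer('embedding_layer_2').set_weights([weights_matrix])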
    

    Apparently, the model needs Input layers as its first layers; without them, there is an error in the Embedding layers. The following code then runs without errors:

    import numpy as np
    model.compile()
    
    x = np.random.rand(100, 288)
    model([x, x])
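
    To use embedding_layer_1 as a lookup table at inference time, as described in the question, you can also call the layer on an index directly; a small sketch, assuming the weights were set as in the snippet above:

    # Calling the embedding layer on an index returns the matching row
    # of weights_matrix, shape (1, 3569).
    idx = np.array([5])
    row = model.get_layer('embedding_layer_1')(idx)
    np.testing.assert_allclose(row.numpy(), weights_matrix[idx])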