Tags: python, tensorflow, keras, glove

Averaging a sentence's word vectors in Keras (pre-trained word embeddings)


I am new to Keras.

My goal is to create a Neural Network Multi-Classification for Sentiment Analysis for tweets.

I used Sequential in Keras to build my model.

I want to use pre-trained word embeddings in the first layer of my model, specifically GloVe.

Here is my model currently:

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

model = Sequential()
model.add(Embedding(vocab_size, 300, weights=[embedding_matrix],
                    input_length=max_length, trainable=False))
model.add(LSTM(100, stateful=False))
model.add(Dense(8, activation='relu'))  # input size is inferred from the previous layer
model.add(Dense(3, activation='softmax'))

embedding_matrix is filled with the vectors from the file glove.840B.300d.txt.

Since the input to my model is sentences (tweets), and after consulting some theory, I want the layer after the Embedding layer to take every word vector in the tweet and average them into a single sentence vector.

Currently I use an LSTM there; I want to replace it with this averaging technique (or p-means). I wasn't able to find anything like this in the Keras documentation.

I'm not sure if this is the right place to ask this, but all help will be appreciated.


Solution

  • You can use the mean function from the Keras backend, wrapped in a Lambda layer, to average the embeddings over the word dimension.

    import keras
    from keras.layers import Embedding
    from keras.models import Sequential
    import numpy as np
    # Set parameters
    vocab_size=1000
    max_length=10
    # Generate random embedding matrix for sake of illustration
    embedding_matrix = np.random.rand(vocab_size,300)
    
    model = Sequential()
    model.add(Embedding(vocab_size, 300, weights=[embedding_matrix],
                        input_length=max_length, trainable=False))
    # Average the output of the Embedding layer over the word dimension
    model.add(keras.layers.Lambda(lambda x: keras.backend.mean(x, axis=1)))
    
    model.summary()
    

    Gives:

    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    embedding_6 (Embedding)      (None, 10, 300)           300000    
    _________________________________________________________________
    lambda_6 (Lambda)            (None, 300)               0         
    =================================================================
    Total params: 300,000
    Trainable params: 0
    Non-trainable params: 300,000
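
    One caveat the answer above does not mention: taking the mean over axis 1 also averages in the vectors at padded positions. A numpy sketch of a masked mean (the function name and the convention that padding uses token id 0 are my assumptions, not part of the original answer):

```python
import numpy as np

def masked_mean(embeddings, token_ids, pad_id=0):
    """Average word vectors, ignoring positions whose token id is pad_id."""
    mask = (np.asarray(token_ids) != pad_id).astype(float)   # (seq_len,)
    denom = max(mask.sum(), 1.0)                             # avoid division by zero
    return (np.asarray(embeddings) * mask[:, None]).sum(axis=0) / denom

# Tweet of length 3 where the last position is padding
emb = np.array([[2.0, 4.0],
                [6.0, 8.0],
                [5.0, 5.0]])       # padding row; its values should be ignored
print(masked_mean(emb, [3, 9, 0]))  # -> [4. 6.]
```

    Inside a Keras model, the same idea could be expressed with backend ops (building the mask from the integer input tensor) in a Lambda layer; the numpy version here is only to verify the arithmetic.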
    

    Furthermore, you can use the Lambda layer to wrap arbitrary functions that operate on tensors in a Keras layer and add them to your model. If you are using the TensorFlow backend, you have access to tensorflow ops as well:

    import tensorflow as tf    
    model = Sequential()
    model.add(Embedding(vocab_size, 300, weights=[embedding_matrix],
                        input_length=max_length, trainable=False))
    model.add(keras.layers.Lambda(lambda x: tf.reduce_mean(x, axis=1)))
    # same model as before
    

    This makes it easier to implement more elaborate averaging functions, such as the p-means mentioned in the question.
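
    For instance, the generalized power mean (p-mean) reduces to the arithmetic mean at p=1 and approaches the elementwise max as p grows. A numpy sketch of the formula (my illustration, not from the original answer; for non-integer p it assumes non-negative vectors so the fractional power is well-defined):

```python
import numpy as np

def p_mean(word_vectors, p):
    """Generalized power mean over the word axis.

    p=1 is the arithmetic mean; larger p approaches the elementwise max.
    """
    x = np.asarray(word_vectors, dtype=float)
    return np.mean(x ** p, axis=0) ** (1.0 / p)

# Two word vectors of dimension 2
words = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
print(p_mean(words, 1))  # arithmetic mean: [2. 3.]
print(p_mean(words, 2))  # quadratic mean
```

    In the model itself, the same computation could be wrapped in a Lambda layer using keras.backend.pow and keras.backend.mean; the numpy version is only there to verify the math.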