deep-learning, theano, lasagne, theano-cuda

Lasagne: use image features as the initial hidden state of an LSTMLayer


I am working on an image-captioning project. I want to use a batch of image features with shape=(batch_size, 512) as the initial hidden state of an LSTMLayer in Lasagne (Theano). The sequence input to the LSTMLayer is a batch of text sequences with shape=(batch_size, max_sequence_length, 512). I noticed that LSTMLayer in Lasagne has a hid_init parameter. Does anyone know how to use it? Or do I need to implement a custom LSTMLayer myself?


Solution

  • You don't need to set hid_init (h_0), because h_0 is computed from c_0 (look at an LSTM cell diagram and trace the connection from c_0 to h_0), so you only have to set the cell_init (c_0) parameter:

    # `cell_init` accepts a Layer whose output shape is (batch_size, num_units),
    # so LSTM_UNITS must be 512 here to match the image features
    decoder = LSTMLayer(l_word_embeddings,
                        num_units=LSTM_UNITS,
                        cell_init=your_image_features_layer_512_shape,  # this is c_0
                        mask_input=l_mask)
    

    You can set cell_init as a Layer or as an array (see the Lasagne LSTMLayer documentation for the accepted types).

    Happy to discuss further if anything is unclear.
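As a sanity check on the claim above, here is a toy pure-Python sketch of a single LSTM step (scalar values, tied weights, no biases — a deliberate simplification, not Lasagne's actual implementation). It shows why initializing only the cell state can be enough: the hidden output h is computed from c through the output gate, so features placed in c_0 already influence the very first hidden state even when h_0 starts at Lasagne's default of zero:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w=0.5):
    """One scalar LSTM step; all weights tied to `w` for illustration."""
    i = sigmoid(w * x + w * h_prev)    # input gate
    f = sigmoid(w * x + w * h_prev)    # forget gate
    o = sigmoid(w * x + w * h_prev)    # output gate
    g = math.tanh(w * x + w * h_prev)  # candidate cell value
    c = f * c_prev + i * g             # new cell state
    h = o * math.tanh(c)               # hidden state is computed FROM c
    return h, c

# Same word input, h_0 = 0 in both cases; only c_0 differs
h1_zero, _ = lstm_step(x=1.0, h_prev=0.0, c_prev=0.0)   # default init
h1_img, _  = lstm_step(x=1.0, h_prev=0.0, c_prev=3.0)   # "image feature" in c_0
print(h1_zero, h1_img)
```

The two first-step hidden states differ, which is exactly the effect you want: the decoder's output is conditioned on the image from the first word onward.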