Tags: python, keras, lstm, lasagne

Convert Lasagne to Keras code (CNN -> LSTM)


I would like to convert this Lasagne code:

net = {}
net['input'] = lasagne.layers.InputLayer((100, 1, 24, 113))
net['conv1/5x1'] = lasagne.layers.Conv2DLayer(net['input'], 64, (5, 1))
net['shuff'] = lasagne.layers.DimshuffleLayer(net['conv1/5x1'], (0, 2, 1, 3))
net['lstm1'] = lasagne.layers.LSTMLayer(net['shuff'], 128)

into Keras code. So far I have come up with this:

multi_input = Input(shape=(1, 24, 113), name='multi_input')
y = Conv2D(64, (5, 1), activation='relu', data_format='channels_first')(multi_input)
y = LSTM(128)(y)

But I get the error: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=4


Solution

    from keras.layers import Input, Conv2D, LSTM, Permute, Reshape
    
    multi_input = Input(shape=(1, 24, 113), name='multi_input')
    print(multi_input.shape)  # (?, 1, 24, 113)
    
    y = Conv2D(64, (5, 1), activation='relu', data_format='channels_first')(multi_input)
    print(y.shape)  # (?, 64, 20, 113)
    
    y = Permute((2, 1, 3))(y)
    print(y.shape)  # (?, 20, 64, 113)
    
    # This line is what you missed
    # ==================================================================
    y = Reshape((int(y.shape[1]), int(y.shape[2]) * int(y.shape[3])))(y)
    # ==================================================================
    print(y.shape)  # (?, 20, 7232)
    
    y = LSTM(128)(y)
    print(y.shape)  # (?, 128)
    

    Explanations

    I put the documentation for Lasagne and Keras here so you can cross-reference:

    Lasagne

    Recurrent layers can be used similarly to feed-forward layers except that the input shape is expected to be (batch_size, sequence_length, num_inputs)

    Keras

    Input shape

    3D tensor with shape (batch_size, timesteps, input_dim).


    Basically the API is the same, but Lasagne does the reshape for you internally (confirmed against the source code in the Edit below). That's why you got this error:

    Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=4
    

    The reason is that the tensor coming out of Conv2D has shape (?, 64, 20, 113), i.e. ndim=4 (with a 5x1 kernel and no padding, the height 24 shrinks to 24 - 5 + 1 = 20).

    Therefore, the solution is to reshape it to (?, 20, 7232).
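
    Incidentally, Keras's Reshape accepts -1 for at most one dimension and infers it from the remaining elements, so if you don't want to spell out the product, the Reshape line above can equivalently be written as:

    y = Reshape((int(y.shape[1]), -1))(y)  # the -1 is inferred as 64 * 113 = 7232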

    Edit

    Confirmed with the Lasagne source code: it does indeed do this reshape for you:

    num_inputs = np.prod(input_shape[2:])
    

    So the correct tensor shape as input for the LSTM is (?, 20, 64 * 113) = (?, 20, 7232).
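
    To make the arithmetic concrete, here is a small numpy check using the shapes from this example (plain numpy, just to verify the numbers):

    import numpy as np

    # shape after DimshuffleLayer: (batch, time, channels, width)
    input_shape = (100, 20, 64, 113)

    # what Lasagne's LSTMLayer computes internally
    num_inputs = np.prod(input_shape[2:])
    print(num_inputs)  # 7232 == 64 * 113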


    Note

    Permute does in Keras what DimshuffleLayer does in Lasagne; I kept it to have a "full translation" from Lasagne to Keras. It is also not optional: a plain Reshape without the preceding Permute would flatten the tensor in the wrong order and mix values from different timesteps into the same row.

    DimshuffleLayer is likewise needed in Lasagne, for the reason mentioned in the Edit: the input dimension that Lasagne's LSTM builds comes from multiplying "the last two" dimensions, so the time axis has to be moved in front of them first.
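
    A quick numpy sanity check of that claim (numpy transpose/reshape stand in for Keras's Permute/Reshape here):

    import numpy as np

    x = np.arange(64 * 20 * 113).reshape(1, 64, 20, 113)  # (batch, channels, time, width)

    # Permute((2, 1, 3)) followed by Reshape((20, -1)), as in the solution:
    permuted = x.transpose(0, 2, 1, 3).reshape(1, 20, 64 * 113)

    # Reshape alone, skipping the Permute:
    flat_only = x.reshape(1, 20, 64 * 113)

    # Not the same tensor: without Permute, each "timestep" row mixes
    # values that came from different actual timesteps.
    print(np.array_equal(permuted, flat_only))  # False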