tensorflow, lstm, recurrent-neural-network, bidirectional, multi-layer

How to use a multi-layer bidirectional LSTM in TensorFlow?


I want to know how to use a multi-layer bidirectional LSTM in TensorFlow.

I have already implemented a bidirectional LSTM, but I would like to compare that model against a version with additional layers.

How should I change the code in this part?

x = tf.unstack(tf.transpose(x, perm=[1, 0, 2]))
#print(x[0].get_shape())

# Define lstm cells with tensorflow
# Forward direction cell
lstm_fw_cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
# Backward direction cell
lstm_bw_cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)

# Get lstm cell output
try:
    outputs, _, _ = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x,
                                          dtype=tf.float32)
except Exception: # Old TensorFlow version only returns outputs not states
    outputs = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x,
                                    dtype=tf.float32)

# Linear activation, using rnn inner loop last output
outputs = tf.stack(outputs, axis=1)
outputs = tf.reshape(outputs, (batch_size*n_steps, n_hidden*2))
outputs = tf.matmul(outputs, weights['out']) + biases['out']
outputs = tf.reshape(outputs, (batch_size, n_steps, n_classes))

Solution

  • You can use two different approaches to build a multi-layer BiLSTM model:

    1) Feed the output of the previous BiLSTM layer into the next one as its input. Start by creating lists of forward and backward cells (cell_forw and cell_back below), each of length num_layers, then loop over the layers:

    for n in range(num_layers):
        cell_fw = cell_forw[n]
        cell_bw = cell_back[n]

        state_fw = cell_fw.zero_state(batch_size, tf.float32)
        state_bw = cell_bw.zero_state(batch_size, tf.float32)

        # `output` holds the input sequence on the first iteration and the
        # previous layer's concatenated output on every later iteration
        (output_fw, output_bw), last_state = tf.nn.bidirectional_dynamic_rnn(
            cell_fw, cell_bw, output,
            initial_state_fw=state_fw,
            initial_state_bw=state_bw,
            scope='BLSTM_' + str(n),
            dtype=tf.float32)

        # concatenate forward and backward outputs along the feature axis
        output = tf.concat([output_fw, output_bw], axis=2)
    

    2) Another approach worth a look is a stacked BiLSTM, e.g. tf.contrib.rnn.stack_bidirectional_dynamic_rnn in TF 1.x, which takes lists of forward and backward cells and wires the layers together for you.
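
    One subtlety of approach 1 is that each BiLSTM layer emits 2*n_hidden features (forward and backward concatenated), so every layer after the first sees a wider input than the raw sequence. Here is a framework-free sketch of that shape bookkeeping in plain Python; fake_lstm_pass and bilstm_layer are hypothetical stand-ins, not TensorFlow APIs, and the dimensions are toy values:

    ```python
    def fake_lstm_pass(inputs, n_hidden):
        """Stand-in for one LSTM direction: maps each timestep's feature
        vector to an n_hidden-dimensional output vector (zeros here)."""
        return [[0.0] * n_hidden for _ in inputs]

    def bilstm_layer(inputs, n_hidden):
        """One bidirectional layer: run a forward and a backward pass and
        concatenate their per-timestep outputs along the feature axis."""
        fw = fake_lstm_pass(inputs, n_hidden)
        bw = fake_lstm_pass(list(reversed(inputs)), n_hidden)
        bw.reverse()  # re-align backward outputs with forward time order
        return [f + b for f, b in zip(fw, bw)]  # width becomes 2*n_hidden

    n_steps, n_input, n_hidden, num_layers = 5, 8, 16, 3
    output = [[0.0] * n_input for _ in range(n_steps)]  # the input sequence

    for n in range(num_layers):
        # as in approach 1: each layer consumes the previous layer's output
        output = bilstm_layer(output, n_hidden)
        print(f"layer {n}: {len(output)} steps x {len(output[0])} features")
    ```

    Running this prints a width of 2*n_hidden = 32 for every layer: the time dimension is preserved, while the feature dimension is fixed by the concatenation regardless of the layer's input width, which is exactly why the same loop body works for all num_layers iterations.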