In PyTorch there is an LSTM module which, in addition to the input sequence, hidden states, and cell states, accepts a num_layers argument that specifies how many layers the LSTM will have.
There is, however, another module, LSTMCell, which takes only the input size and the number of hidden units as parameters; it has no num_layers, since it represents a single cell in a multi-layered LSTM.
My question is: what is the proper way to connect LSTMCell modules together to achieve the same effect as a multi-layered LSTM with num_layers > 1?
LSTMCell is the basic building block of an LSTM network. You should use the LSTM module (which uses LSTMCell internally). If you want to do this yourself, the best way is to read the source code (https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/rnn.py).
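For reference, a minimal example of the built-in module with num_layers=2 (the sizes here are arbitrary, chosen just for illustration):

```python
import torch
import torch.nn as nn

# Two stacked LSTM layers handled internally by the LSTM module.
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)

x = torch.randn(5, 3, 10)           # (seq_len, batch, input_size)
h0 = torch.zeros(2, 3, 20)          # (num_layers, batch, hidden_size)
c0 = torch.zeros(2, 3, 20)

out, (hn, cn) = lstm(x, (h0, c0))   # out: (seq_len, batch, hidden_size)
```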
Basically you want one LSTMCell per layer, and you have to be careful about how the data flows from input to output, layer by layer, taking the hidden states into account. I also have a basic implementation of a convolutional LSTM, but the idea is the same. You can check it here: https://github.com/rogertrullo/pytorch_convlstm/
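If you do want to wire it up by hand, here is a minimal sketch of the idea. The StackedLSTM name, the zero initial states, and the fixed two layers are my own choices, not taken from the PyTorch source; the real LSTM module also adds things like dropout between layers and fused kernels:

```python
import torch
import torch.nn as nn

class StackedLSTM(nn.Module):
    """Sketch of a multi-layer LSTM assembled from LSTMCell modules.

    The first cell consumes the input at each time step; every further cell
    consumes the hidden state produced by the cell below it at that step.
    """
    def __init__(self, input_size, hidden_size, num_layers=2):
        super().__init__()
        self.hidden_size = hidden_size
        self.cells = nn.ModuleList(
            [nn.LSTMCell(input_size if i == 0 else hidden_size, hidden_size)
             for i in range(num_layers)]
        )

    def forward(self, x):                      # x: (seq_len, batch, input_size)
        batch = x.size(1)
        # One (h, c) pair per layer, initialised to zeros (an assumption here).
        states = [(x.new_zeros(batch, self.hidden_size),
                   x.new_zeros(batch, self.hidden_size))
                  for _ in self.cells]
        outputs = []
        for t in range(x.size(0)):
            inp = x[t]
            for i, cell in enumerate(self.cells):
                h, c = cell(inp, states[i])
                states[i] = (h, c)
                inp = h                        # hidden state feeds the layer above
            outputs.append(inp)                # hidden state of the top layer
        return torch.stack(outputs), states
```

The key point is the inner loop: at every time step the hidden state of layer i becomes the input of layer i+1, which is exactly what num_layers > 1 does inside the LSTM module.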