I'm creating an LSTM Autoencoder for feature extraction for my master's thesis. However, I'm having a lot of trouble combining dropout with LSTM layers.
Since it's an Autoencoder, there is a bottleneck, which is achieved by having two separate LSTM layers, each with num_layers=1, and a dropout layer in between. My time series have very different lengths, so packed sequences seemed like a good idea for that reason.
However, from my experiments it seems I have to pack the data before the first LSTM, unpack it before the dropout, and then pack it again before the second LSTM. This seems wildly inefficient. Is there a better way? I'm providing some example code and an alternative implementation below.
Current, working, but possibly suboptimal solution:
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

class Encoder(nn.Module):
    def __init__(self, seq_len, n_features, embedding_dim, hidden_dim, dropout):
        super(Encoder, self).__init__()
        self.seq_len = seq_len
        self.n_features = n_features
        self.embedding_dim = embedding_dim
        self.hidden_dim = hidden_dim
        self.lstm1 = nn.LSTM(
            input_size=n_features,
            hidden_size=self.hidden_dim,
            num_layers=1,
            batch_first=True,
        )
        self.lstm2 = nn.LSTM(
            input_size=self.hidden_dim,
            hidden_size=embedding_dim,
            num_layers=1,
            batch_first=True,
        )
        self.drop1 = nn.Dropout(p=dropout, inplace=False)

    def forward(self, x):
        # x is a PackedSequence
        x, (_, _) = self.lstm1(x)
        # Unpack so dropout can be applied to the padded tensor
        x, lens = pad_packed_sequence(x, batch_first=True, total_length=self.seq_len)
        x = self.drop1(x)
        # Re-pack before the second LSTM
        x = pack_padded_sequence(x, lens, batch_first=True, enforce_sorted=False)
        x, (hidden_n, _) = self.lstm2(x)
        return hidden_n.reshape((-1, self.n_features, self.embedding_dim)), lens
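For context, this is roughly how I feed data into the encoder. The batch construction below is only a sketch of my setup: the names (seqs, lengths, padded) and all dimensions are made up for illustration.

import torch
from torch.nn.utils.rnn import pack_padded_sequence

seq_len, n_features = 100, 1
# Three variable-length series (illustrative data)
seqs = [torch.randn(30, n_features), torch.randn(100, n_features), torch.randn(57, n_features)]
lengths = torch.tensor([len(s) for s in seqs])

# Pad every series to the full seq_len so total_length matches in the encoder
padded = torch.zeros(len(seqs), seq_len, n_features)
for i, s in enumerate(seqs):
    padded[i, : len(s)] = s

packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

encoder = Encoder(seq_len, n_features, embedding_dim=16, hidden_dim=32, dropout=0.2)
emb, lens = encoder(packed)  # emb: (batch, n_features, embedding_dim) after the reshape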
Alternative, possibly better, but currently not working solution:
class Encoder2(nn.Module):
    def __init__(self, seq_len, n_features, embedding_dim, hidden_dim, dropout):
        super(Encoder2, self).__init__()
        self.seq_len = seq_len
        self.n_features = n_features
        self.embedding_dim = embedding_dim
        self.hidden_dim = hidden_dim
        self.lstm1 = nn.LSTM(
            input_size=n_features,
            hidden_size=self.hidden_dim,
            num_layers=2,
            batch_first=True,
            dropout=dropout,
            proj_size=self.embedding_dim,
        )

    def forward(self, x):
        _, (h_n, _) = self.lstm1(x)
        # NOTE: lens is not defined here, which is part of why this version doesn't run
        return h_n[-1].unsqueeze(1), lens
Any help and tips about working with time series, packed sequences, LSTM cells, and dropout would be immensely appreciated, as I haven't found much documentation or guidance elsewhere on the internet. Thank you!
Best, Lars Ankile
For posterity: after a lot of trial and error, the following full code for the Autoencoder seems to work very well. Getting the packing and unpacking to work correctly was the main hurdle. The key, I think, is to use the LSTM modules for all they're worth by taking advantage of the proj_size, num_layers, and dropout parameters.
class EncoderV4(nn.Module):
    def __init__(
        self, seq_len, n_features, embedding_dim, hidden_dim, dropout, num_layers
    ):
        super().__init__()
        self.seq_len = seq_len
        self.n_features = n_features
        self.embedding_dim = embedding_dim
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers
        # A single stacked LSTM: dropout is applied between the layers and
        # proj_size projects the hidden state down to the embedding dimension.
        self.lstm1 = nn.LSTM(
            input_size=n_features,
            hidden_size=self.hidden_dim,
            num_layers=num_layers,
            batch_first=True,
            dropout=dropout,
            proj_size=self.embedding_dim,
        )

    def forward(self, x):
        # x is a PackedSequence; h_n has shape (num_layers, batch, embedding_dim)
        _, (h_n, _) = self.lstm1(x)
        # Keep only the last layer's hidden state as the embedding: (batch, 1, embedding_dim)
        return h_n[-1].unsqueeze(1)
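To make the effect of proj_size concrete, here is a small sanity-check sketch; all dimensions and lengths are made up for illustration:

import torch
from torch.nn.utils.rnn import pack_padded_sequence

enc = EncoderV4(seq_len=100, n_features=1, embedding_dim=16,
                hidden_dim=32, dropout=0.2, num_layers=2)
packed = pack_padded_sequence(
    torch.randn(4, 100, 1), torch.tensor([100, 80, 57, 30]),
    batch_first=True, enforce_sorted=False,
)
print(enc(packed).shape)  # torch.Size([4, 1, 16]): one embedding vector per series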
class DecoderV4(nn.Module):
    def __init__(self, seq_len, input_dim, hidden_dim, n_features, num_layers):
        super().__init__()
        self.seq_len = seq_len
        self.input_dim = input_dim
        self.hidden_dim = hidden_dim
        self.n_features = n_features
        self.num_layers = num_layers
        # proj_size maps the hidden state back to the original feature dimension,
        # so no separate output layer is needed.
        self.lstm1 = nn.LSTM(
            input_size=input_dim,
            hidden_size=hidden_dim,
            num_layers=num_layers,
            proj_size=n_features,
            batch_first=True,
        )

    def forward(self, x, lens):
        # Repeat the embedding along the time axis, then pack it so the
        # reconstruction only runs over each series' true length.
        x = x.repeat(1, self.seq_len, 1)
        x = pack_padded_sequence(x, lens, batch_first=True, enforce_sorted=False)
        x, _ = self.lstm1(x)
        return x
class RecurrentAutoencoderV4(nn.Module):
    def __init__(
        self, seq_len, n_features, embedding_dim, hidden_dim, dropout, num_layers
    ):
        super().__init__()
        self.encoder = EncoderV4(
            seq_len, n_features, embedding_dim, hidden_dim, dropout, num_layers
        )
        self.decoder = DecoderV4(
            seq_len, embedding_dim, hidden_dim, n_features, num_layers
        )

    def forward(self, x, lens):
        x = self.encoder(x)
        x = self.decoder(x, lens)
        return x
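For completeness, here is a rough sketch of how I run the full Autoencoder in a single training step. The padding, the loss masking, and all concrete dimensions below are illustrative assumptions, not part of the model code above:

import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

seq_len, n_features = 100, 1
model = RecurrentAutoencoderV4(
    seq_len=seq_len, n_features=n_features, embedding_dim=16,
    hidden_dim=32, dropout=0.2, num_layers=2,
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Illustrative batch: padded series plus their true lengths
padded = torch.randn(4, seq_len, n_features)
lens = torch.tensor([100, 80, 57, 30])

packed_in = pack_padded_sequence(padded, lens, batch_first=True, enforce_sorted=False)
packed_out = model(packed_in, lens)

# Unpack the reconstruction and compute the loss only over valid time steps
recon, _ = pad_packed_sequence(packed_out, batch_first=True, total_length=seq_len)
mask = torch.arange(seq_len)[None, :] < lens[:, None]  # (batch, seq_len)
loss = ((recon - padded) ** 2)[mask].mean()

optimizer.zero_grad()
loss.backward()
optimizer.step()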
The full code and a paper using this Autoencoder can be found at GitHub and arXiv, respectively.