How do we build a simple linear autoencoder and train it using torch.optim optimisers?
How do I do this with autograd (.backward()), minimising the MSE loss so that the weights and biases of both the encoder and the decoder are learned (i.e. 3 parameters in the encoder and 4 in the decoder)? The data has to be randomized, and each learning run should start from random weights and biases, such as:
wEncoder = torch.randn(D,1, requires_grad=True)
wDecoder = torch.randn(1,D, requires_grad=True)
bEncoder = torch.randn(1, requires_grad=True)
bDecoder = torch.randn(1,D, requires_grad=True)
The target optimizer is SGD with learning rate 0.01 and no momentum, run for 1000 steps from a random start. How do we then plot loss versus epochs (steps)?
I tried this, but the loss is the same for every epoch:
D = 2
x = torch.rand(100,D)
x[:,0] = x[:,0] + x[:,1]
x[:,1] = 0.5*x[:,0] + x[:,1]
loss_fn = nn.MSELoss()
optimizer = optim.SGD([x[:,0],x[:,1]], lr=0.01)
losses = []
for epoch in range(1000):
    running_loss = 0.0
    inputs = x_reconstructed
    targets = x
    running_loss += loss.item()
    epoch_loss = running_loss / len(data)
This example should get you going. Please see code comments for further explanation:
import torch


# Use torch.nn.Module to create models
class AutoEncoder(torch.nn.Module):
    def __init__(self, features: int, hidden: int):
        # Necessary to set up parameter/submodule registration,
        # log C++ API usage and other internals
        super().__init__()
        self.encoder = torch.nn.Linear(features, hidden)
        self.decoder = torch.nn.Linear(hidden, features)

    def forward(self, X):
        return self.decoder(self.encoder(X))

    def encode(self, X):
        return self.encoder(X)


# Random data
data = torch.rand(100, 4)
model = AutoEncoder(4, 10)

# Pass model.parameters() for increased readability
# Weights and biases of encoder and decoder will be passed
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

# Per-epoch losses are gathered
# Loss is the mean over all elements, in our case 100 samples x 4 features
losses = []
for epoch in range(1000):
    # Clear the gradients accumulated during the previous step
    optimizer.zero_grad()
    reconstructed = model(data)
    loss = loss_fn(reconstructed, data)
    # No need to retain_graph=True as you are not performing multiple passes
    # of backpropagation
    loss.backward()
    # Update the parameters and record this epoch's loss
    optimizer.step()
    losses.append(loss.item())
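If you want to stay closer to the setup in the question (raw tensors with requires_grad=True instead of torch.nn.Module), something along these lines should work. It is only a sketch: the fixed seed, the helper names train_once and plot_losses, and the use of matplotlib for plotting are my assumptions, not part of the question. The key point is that the optimizer receives the encoder/decoder parameters, not slices of the data.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed only to make the sketch reproducible

D = 2
# Correlated 2-D data, built as in the question but without in-place edits
base = torch.rand(100, D)
x0 = base[:, 0] + base[:, 1]
x1 = 0.5 * x0 + base[:, 1]
x = torch.stack([x0, x1], dim=1)


def train_once(x, steps=1000, lr=0.01):
    """One learning run from a fresh random start (hypothetical helper)."""
    D = x.shape[1]
    wEncoder = torch.randn(D, 1, requires_grad=True)
    wDecoder = torch.randn(1, D, requires_grad=True)
    bEncoder = torch.randn(1, requires_grad=True)
    bDecoder = torch.randn(1, D, requires_grad=True)
    params = [wEncoder, bEncoder, wDecoder, bDecoder]

    # Optimise the parameters, not the data; momentum defaults to 0
    optimizer = torch.optim.SGD(params, lr=lr)
    loss_fn = torch.nn.MSELoss()

    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        code = x @ wEncoder + bEncoder                # shape (100, 1)
        x_reconstructed = code @ wDecoder + bDecoder  # shape (100, D)
        loss = loss_fn(x_reconstructed, x)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses, params


def plot_losses(losses):
    """Loss versus epoch; matplotlib is imported here so training runs without it."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(range(1, len(losses) + 1), losses)
    ax.set_xlabel("epoch (step)")
    ax.set_ylabel("MSE loss")
    return fig


losses, params = train_once(x)
```

Calling plot_losses(losses) followed by matplotlib.pyplot.show() then displays the curve. Calling train_once(x) again gives a new run from a different random start.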
Please note that a linear autoencoder is roughly equivalent to a PCA decomposition, which is more efficient to compute.
You should probably use a non-linear autoencoder unless this is simply for learning purposes.
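For comparison, here is a rough sketch of the PCA route using an SVD of the centred data. The bottleneck size k = 2 and the variable names are assumptions chosen for illustration. The comparison only makes sense when k is smaller than the number of features.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed for reproducibility
data = torch.rand(100, 4)
k = 2  # hypothetical bottleneck size, smaller than the 4 features

# Centre the data; PCA works on deviations from the mean
mean = data.mean(dim=0, keepdim=True)
centered = data - mean

# Rows of Vh are the principal directions, ordered by singular value
U, S, Vh = torch.linalg.svd(centered, full_matrices=False)
components = Vh[:k]                          # shape (k, 4)

codes = centered @ components.T              # "encode": shape (100, k)
reconstructed = codes @ components + mean    # "decode": shape (100, 4)

pca_mse = torch.nn.functional.mse_loss(reconstructed, data)
print(f"PCA reconstruction MSE with k={k}: {pca_mse.item():.6f}")
```

No iterative training is involved here, which is where the efficiency gain comes from. The resulting MSE is the best any linear encoder/decoder pair with a k-dimensional code can reach on this data.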
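A minimal way to make the model non-linear is to put an activation after the encoder layer. The class name NonLinearAutoEncoder, the choice of ReLU and the sizes below are assumptions for illustration. The training loop stays the same as for the linear model.

```python
import torch


class NonLinearAutoEncoder(torch.nn.Module):
    def __init__(self, features: int, hidden: int):
        super().__init__()
        # The ReLU after the linear layer is what breaks the PCA equivalence
        self.encoder = torch.nn.Sequential(
            torch.nn.Linear(features, hidden),
            torch.nn.ReLU(),
        )
        self.decoder = torch.nn.Linear(hidden, features)

    def forward(self, X):
        return self.decoder(self.encoder(X))

    def encode(self, X):
        return self.encoder(X)


model = NonLinearAutoEncoder(4, 2)
data = torch.rand(100, 4)
reconstructed = model(data)   # shape (100, 4)
codes = model.encode(data)    # shape (100, 2), non-negative because of ReLU
```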