optimization, deep-learning, pytorch, autoencoder, sgd

Linear autoencoder using PyTorch


How do we build a simple linear autoencoder and train it using torch.optim optimizers?

How do I do it using autograd (.backward()) to optimize the MSE loss, and then learn the values of the weights and biases in the encoder and the decoder (i.e. 3 parameters in the encoder and 4 in the decoder)? The data has to be randomized, and each run of learning should start from random weights and biases, such as:

wEncoder = torch.randn(D,1, requires_grad=True)
wDecoder = torch.randn(1,D, requires_grad=True)
bEncoder = torch.randn(1, requires_grad=True)
bDecoder = torch.randn(1,D, requires_grad=True)

The target optimizer is SGD with a learning rate of 0.01, no momentum, and 1000 steps (from a random start). How do we then plot the loss versus epochs (steps)?

I tried this but the losses are the same for every epoch.

import torch
import torch.nn as nn
import torch.optim as optim

D = 2
x = torch.rand(100,D)
x[:,0] = x[:,0] + x[:,1]
x[:,1] = 0.5*x[:,0] + x[:,1]

loss_fn = nn.MSELoss()
optimizer = optim.SGD([x[:,0],x[:,1]], lr=0.01)
losses = []
for epoch in range(1000):
    running_loss = 0.0
    inputs = x_reconstructed
    targets = x
    loss=loss_fn(inputs,targets)
    loss.backward(retain_graph=True)
    optimizer.step()
    optimizer.zero_grad()
    running_loss += loss.item() 
    epoch_loss = running_loss / len(data)
    losses.append(running_loss)

Solution

  • This example should get you going. Please see code comments for further explanation:

    import torch
    
    
    # Use torch.nn.Module to create models
    class AutoEncoder(torch.nn.Module):
        def __init__(self, features: int, hidden: int):
        # Initialize torch.nn.Module internals so submodules and their parameters are registered
            super().__init__()
            self.encoder = torch.nn.Linear(features, hidden)
            self.decoder = torch.nn.Linear(hidden, features)
    
        def forward(self, X):
            return self.decoder(self.encoder(X))
    
        def encode(self, X):
            return self.encoder(X)
    
    # Random data
    data = torch.rand(100, 4)
    model = AutoEncoder(4, 10)
    # Pass model.parameters() for increased readability
    # Weights of encoder and decoder will be passed
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = torch.nn.MSELoss()
    
    # Per-epoch losses are gathered
    # MSELoss with its default reduction returns the mean over all batch elements
    losses = []
    for epoch in range(1000):
        reconstructed = model(data)
        loss = loss_fn(reconstructed, data)
        # No need for retain_graph=True, since we do not backpropagate
        # through the same graph more than once
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    
        losses.append(loss.item())
    
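    To answer the plotting part of the question: the losses list gathered above can be plotted directly against the step index, for example with matplotlib (assumed to be available):

    import matplotlib.pyplot as plt

    # losses[i] is the MSE after optimization step i
    plt.plot(range(len(losses)), losses)
    plt.xlabel("Epoch (step)")
    plt.ylabel("MSE loss")
    plt.title("Training loss of the linear autoencoder")
    plt.show()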

    Please note that a linear autoencoder is roughly equivalent to a PCA decomposition, which is more efficient to compute.
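
    If you want to check that claim, a minimal sketch (assuming a bottleneck of 2 units for the 4-dimensional data above, i.e. hidden < features) is to compute the rank-2 PCA reconstruction of the same data with torch.pca_lowrank and compare its MSE with the autoencoder's final loss:

    # Rank-2 PCA reconstruction of the same data (bottleneck of 2 assumed here)
    mean = data.mean(dim=0)
    U, S, V = torch.pca_lowrank(data, q=2)               # V holds the top-2 principal directions
    pca_reconstructed = (data - mean) @ V @ V.T + mean   # project onto those directions
    print("PCA MSE:", torch.nn.functional.mse_loss(pca_reconstructed, data).item())

    A linear autoencoder with 2 hidden units, trained to convergence, should reach roughly the same reconstruction error, since it learns the same subspace (though not necessarily the orthogonal principal components themselves).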

    You should probably use a non-linear autoencoder unless this is purely a learning exercise.
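
    If you specifically want the 3-parameter encoder and 4-parameter decoder from the question instead of nn.Linear layers, a minimal sketch with plain tensors and autograd could look like the following (same settings as requested: SGD, lr=0.01, no momentum, 1000 steps from a random start). The key differences from your attempt are that the optimizer is given the weights and biases rather than the data columns, and that the reconstruction is recomputed inside the loop, so the loss actually changes between steps:

    import torch

    D = 2
    x = torch.rand(100, D)
    x[:, 0] = x[:, 0] + x[:, 1]
    x[:, 1] = 0.5 * x[:, 0] + x[:, 1]

    # Random starting point for every run
    wEncoder = torch.randn(D, 1, requires_grad=True)
    wDecoder = torch.randn(1, D, requires_grad=True)
    bEncoder = torch.randn(1, requires_grad=True)
    bDecoder = torch.randn(1, D, requires_grad=True)

    # Optimize the parameters, not the data
    optimizer = torch.optim.SGD([wEncoder, wDecoder, bEncoder, bDecoder], lr=0.01)
    loss_fn = torch.nn.MSELoss()

    losses = []
    for epoch in range(1000):
        encoded = x @ wEncoder + bEncoder              # (100, 1)
        reconstructed = encoded @ wDecoder + bDecoder  # (100, D)
        loss = loss_fn(reconstructed, x)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        losses.append(loss.item())

    Plotting losses against the epoch index as shown above will then show the loss decreasing from its random starting value.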