Tags: pytorch, autograd, polynomial-approximations

Error while implementing polynomial regression with PyTorch: gradients are None after loss.backward()


I am trying to implement a custom polynomial regression using PyTorch, but during the training procedure my implementation fails to calculate the gradients; i.e. the gradients of the weights are always None, even after the loss.backward() call. Below I give all the necessary details.

Step 1 I generate some data points with the following commands:

import numpy as np
import torch
import matplotlib.pyplot as plt
from torch.autograd import Function
import torch.nn as nn
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
seed_value = 42
np.random.seed(seed_value)
x = np.sort(np.random.rand(1000))
y = np.cos(1.2 * x * np.pi) + (0.1 * np.random.randn(1000))
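For reference, the generated data can be visualized with matplotlib (already imported above); this is just an optional sanity check:

plt.scatter(x, y, s=2)
plt.xlabel("x")
plt.ylabel("y")
plt.title("cos(1.2*pi*x) with 0.1 Gaussian noise")
plt.show()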

Then I use train_test_split from sklearn to split my data into training and test sets.

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(x,y,train_size = 0.7,
                                                   random_state = seed_value)

Step 2 I create a custom function named poly which returns the value of the polynomial p(x) = w0 + w1*x + ... + w5*x^5, evaluated at x for given weights w.

def poly(x, w, batch_size = 10, degree = 5):
    x = x.repeat(1, degree+1)                                # (batch_size, degree+1)
    w = w.repeat(batch_size, 1)                              # (batch_size, degree+1)
    exp = torch.arange(0., degree+1).repeat(batch_size, 1)   # exponents 0..degree
    return torch.sum(w*torch.pow(x, exp), dim=1)             # p(x) for each sample
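As a quick sanity check (illustrative only; the names below are made up for the example), poly should return one prediction per sample in the batch:

x_check = torch.rand(10, 1)   # a batch of 10 inputs, shape (10, 1)
w_check = torch.randn(6)      # 6 coefficients for a degree-5 polynomial
print(poly(x_check, w_check, batch_size = 10).shape)  # torch.Size([10])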

Step 3 I construct the class custom_dataset, which inherits from PyTorch's Dataset, so that a DataLoader can serve my training data in batches.

class custom_dataset(Dataset):
    def __init__(self,X,y):
        self.x = torch.from_numpy(X).type(torch.float32).reshape(len(X),1)
        self.y = torch.from_numpy(y).type(torch.float32)
    def __len__(self):
        return len(self.x)
    def __getitem__(self,idx):
        return self.x[idx], self.y[idx]
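A quick check (again illustrative) that the dataset returns tensors of the expected shapes:

check_ds = custom_dataset(X_train, y_train)
x0, y0 = check_ds[0]
print(x0.shape, y0.shape)  # torch.Size([1]) torch.Size([])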

Step 4 I construct the loop handling the training procedure.

training_data = custom_dataset(X_train,y_train)
test_data = custom_dataset(X_test,y_test)
def training_loop(train_loader, w, epochs, lr, batch_size,
                  loss_fn = nn.MSELoss(), degree = 5):
    weights = torch.tensor(w,dtype = torch.float32, requires_grad = True)
    num_batches = len(train_loader)//batch_size
    for epoch in range(1,epochs+1):
        print(f"{5*'-'}>epoch:{epoch}<{5*'-'}")
        for i,sample in enumerate(train_loader):
            x,y = sample
            y_preds = poly(x,weights,batch_size = batch_size)
            loss = loss_fn(y,y_preds)
            loss.backward() # backpropagation
            weights = weights - lr*weights.grad # update - gradient descent
            
            if (i+1) % 100 == 0:
                print(f"- Batch:[{i+1}|{num_batches}]{5*' '}Samples:[{(i+1)*num_batches}|{len(train_loader)}]{5*' '}loss:{loss.item():.6f}")         
    return w

Step 5 I start training...

epochs = 10
lr = 1e-3
batch_size = 10
degree = 5
train_loader = DataLoader(training_data, batch_size = batch_size,
                         shuffle = True)
test_loader = DataLoader(test_data, batch_size = batch_size,
                        shuffle = True)
w = [0]*(degree+1)
w = training_loop(train_loader, w = w, epochs = 30, lr = lr,
                  batch_size = batch_size)

and I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [40], in <cell line: 10>()
      7 test_loader = DataLoader(test_data, batch_size = batch_size,
      8                         shuffle = True)
      9 w = [0]*(degree+1)
---> 10 w = training_loop(train_loader, w = w, epochs = 30, lr = lr,
     11                   batch_size = batch_size)

Input In [39], in training_loop(train_loader, w, epochs, lr, batch_size, loss_fn, degree)
     10 loss = loss_fn(y,y_preds)
     11 loss.backward() # backpropagation
---> 12 weights = weights - lr*weights.grad # update - gradient descent
     14 if (i+1) % 100 == 0:
     15     print(f"- Batch:[{i+1}|{num_batches}]{5*' '}Samples:[{(i+1)*num_batches}|{len(train_loader)}]{5*' '}loss:{loss.item():.6f}")         

TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'

This means that the computation of the gradients did not affect the variable weights, as its grad attribute is still set to None. Do you have any idea what is wrong?


Solution

  • You are overwriting the weights variable on your first loop iteration: the expression weights - lr*weights.grad creates a new tensor that is the result of an operation rather than a leaf of the computation graph, and autograd only populates the grad attribute of leaf tensors. On the next iteration, weights.grad is therefore None. This behavior can be reproduced with the following minimal code:

    >>> weights.grad = torch.ones_like(weights)
    >>> for i in range(2):
    ...     print(weights.grad)
    ...     weights = weights - weights.grad
    
    tensor([1., 1.])
    None 
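    The reassigned tensor is no longer a leaf of the computation graph (the reason autograd stops populating grad), which you can verify with is_leaf:

    >>> weights = torch.zeros(2, requires_grad = True)
    >>> weights.is_leaf
    True
    >>> weights = weights - 1
    >>> weights.is_leaf
    False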
    

    To fix this, update the parameter with an in-place operation. Note that an in-place update on a leaf tensor that requires grad has to be wrapped in a torch.no_grad() block (otherwise PyTorch raises a RuntimeError), and the gradient should be zeroed afterwards so it does not accumulate across batches:

            with torch.no_grad():
                weights -= lr*weights.grad # update - gradient descent
                weights.grad.zero_()       # reset the gradient for the next batch
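
    Alternatively (a sketch, not part of the original answer), the same update rule can be delegated to torch.optim.SGD, which performs the in-place step and the gradient reset for you:

        optimizer = torch.optim.SGD([weights], lr = lr)
        for x, y in train_loader:
            y_preds = poly(x, weights, batch_size = batch_size)
            loss = loss_fn(y, y_preds)
            loss.backward()
            optimizer.step()       # applies weights -= lr * weights.grad in-place
            optimizer.zero_grad()  # resets weights.grad for the next batch

    Note also that training_loop returns w (the initial list) instead of weights, so the trained parameters are currently discarded.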