python pytorch floating-point

PyTorch Convolution Network: problem with floating point type


I'm trying to train a CV model on the standard MNIST data:

 import torch
 from torchvision.datasets import MNIST
 import torchvision.transforms as transforms

 img_transforms = transforms.Compose([
     transforms.ToTensor(),
     transforms.Normalize((0.1305,), (0.3081,))
 ])

 train_dataset = MNIST(root='../mnist_data/',
                  train=True,
                  download=True,
                  transform=img_transforms)

 train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                       batch_size=10,
                                       shuffle=True)

Model is declared as:

import torch.nn as nn

class MNIST_ConvNet(nn.Module):
  def __init__(self):
     super().__init__()
     self.conv1 = ConvLayer(1, 14, 5, activation=nn.Tanh(),
                           dropout=0.8)
     self.conv2 = ConvLayer(14, 7, 5, activation=nn.Tanh(), flatten=True,
                           dropout=0.8)
     self.dense1 = DenseLayer(28 * 28 * 7, 32, activation=nn.Tanh(),
                             dropout=0.8)
     self.dense2 = DenseLayer(32, 10)

  def forward(self, x: Tensor) -> Tensor:
     assert_dim(x, 4)

     x = self.conv1(x)
     x = self.conv2(x)

     x = self.dense1(x)
     x = self.dense2(x)
     return x

Then I run the forward pass and compute the loss for this model, following the usual PyTorch approach:

import torch.optim as optim

model = MNIST_ConvNet()
for X_batch, y_batch in train_loader:

                optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
                optimizer.zero_grad()
                output = model(X_batch)[0]
                loss = nn.CrossEntropyLoss()
                loss = loss(output, y_batch)

X_batch has the following content:

tensor([[[[-0.4236, -0.4236, -0.4236,  ..., -0.4236, -0.4236, -0.4236],
      [-0.4236, -0.4236, -0.4236,  ..., -0.4236, -0.4236, -0.4236],
      [-0.4236, -0.4236, -0.4236,  ..., -0.4236, -0.4236, -0.4236],
      ...,

And for this line of code "self.loss(output, y_batch)", I receive the following error:

RuntimeError: Expected floating point type for target with class probabilities, got Long

To solve the problem, I tried changing the data type:

 self.model(X_batch.type(torch.FloatTensor))[0]

But this does not work.


Solution

  • Before answering the question: do not construct a new optimizer or loss criterion on every training iteration. Create them once, before the loop. The code should look something like this:

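    import torch.nn as nn
    import torch.optim as optim

    model = MNIST_ConvNet()
    # Create the optimizer and the loss criterion once, outside the loop.
    optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        output = model(X_batch)            # logits for the whole batch, no [0]
        loss = criterion(output, y_batch)  # separate name, so the criterion is not overwritten
        loss.backward()
        optimizer.step()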
    

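    Additionally, indexing the output with [0] looks like a mistake and is the likely cause of the error. You operate on batches, and [0] extracts the prediction for the first batch element only. The output then has shape (10,) (one logit per class) while y_batch also has shape (10,) (one label per sample). When input and target have the same shape, CrossEntropyLoss treats the target as class probabilities, which must be floating point; hence the complaint about Long.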

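    Once the whole batch is passed to the loss, y_batch should stay an integer (Long) tensor of class indices. Casting it with y_batch.float() is not a real fix: if the shapes happen to match, as they do with the [0] indexing here, the loss would silently treat the labels as class probabilities. output.float() is only needed if the model does not already return a floating-point tensor.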
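
The shape issue can be reproduced without the model. The following is a minimal sketch, assuming a recent PyTorch version that supports class-probability targets; `logits` is a random stand-in for the model output (not the asker's network), with batch size 10 and 10 classes as in the question:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

criterion = nn.CrossEntropyLoss()

# Hypothetical stand-ins for model(X_batch) and y_batch.
logits = torch.randn(10, 10)           # shape (batch, classes)
y_batch = torch.randint(0, 10, (10,))  # shape (batch,), integer class indices

# [0] keeps only the first sample's logits, shape (10,). The target also has
# shape (10,), so the loss expects class probabilities and rejects Long.
try:
    criterion(logits[0], y_batch)
    failed = False
except RuntimeError as err:
    failed = True
    print(err)

# Passing the whole batch works: input (10, 10), target (10,) of class indices.
loss = criterion(logits, y_batch)
print(loss.shape, loss.dtype)
```

The first call fails only because the batch size and the number of classes happen to be equal; with a different batch size the same indexing bug would surface as a different error.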