python machine-learning pytorch datalore

Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! on DataLore


I'm working with IntelliJ DataLore to train a basic VGG16 CNN, but when I try to do it using a GPU machine I get the following error:

Traceback (most recent call last):
  at block 20, line 1
  at /data/workspace_files/train/trainer/training.py, line 115, in train(self, max_epochs)
  at /data/workspace_files/train/trainer/training.py, line 46, in train_epoch(self, train_loader)
  at /data/workspace_files/train/trainer/training.py, line 94, in forward_to_loss(self, step_images, step_labels)
  at /opt/python/envs/default/lib/python3.8/site-packages/torch/nn/modules/module.py, line 1102, in _call_impl(self, *input, **kwargs)
  at /data/workspace_files/models/vgg.py, line 49, in forward(self, x)
  at /opt/python/envs/default/lib/python3.8/site-packages/torch/nn/modules/module.py, line 1102, in _call_impl(self, *input, **kwargs)
  at /opt/python/envs/default/lib/python3.8/site-packages/torch/nn/modules/container.py, line 141, in forward(self, input)
  at /opt/python/envs/default/lib/python3.8/site-packages/torch/nn/modules/module.py, line 1102, in _call_impl(self, *input, **kwargs)
  at /opt/python/envs/default/lib/python3.8/site-packages/torch/nn/modules/linear.py, line 103, in forward(self, input)
  at /opt/python/envs/default/lib/python3.8/site-packages/torch/nn/functional.py, line 1848, in linear(input, weight, bias)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_addmm)

Here is my code so you guys can review it.

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
model = model.to(device)

In this fragment of code I use self.device because I pass the device as a parameter to the class Train:

for _, (data, target) in tqdm(enumerate(train_loader, 1)):
    self.optimizer.zero_grad()
    step_images, step_labels = data.to(self.device), target.to(self.device)
    step_output, loss = self.forward_to_loss(step_images, step_labels)

I haven't had this issue before, so I don't know whether something is missing on DataLore or my code is wrong.

Hope you can help me!


Solution

  • Can you try this? It moves both tensors to the device right where they are passed to forward_to_loss:

    step_output, loss = self.forward_to_loss(step_images.to(self.device), step_labels.to(self.device))
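    The traceback says that mat1 (the input to the nn.Linear layer) and that layer's weights are on different devices, so it may help to check where everything actually lives just before the failing call. Below is a minimal sketch of such a check. report_devices, toy_model and toy_images are hypothetical names used only for illustration; the tiny Sequential model stands in for the real VGG16, which is not shown in the question.

```python
import torch
from torch import nn


def report_devices(model: nn.Module, *tensors: torch.Tensor) -> dict:
    """Collect the device types used by a model's parameters/buffers and by some tensors."""
    model_devices = {p.device.type for p in model.parameters()}
    model_devices |= {b.device.type for b in model.buffers()}
    tensor_devices = {t.device.type for t in tensors}
    return {"model": model_devices, "tensors": tensor_devices}


# Hypothetical stand-ins for the real model and one training batch.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 4)).to(device)
toy_images = torch.randn(2, 3, 8, 8).to(device)

report = report_devices(toy_model, toy_images)
print(report)

# Everything is consistent when both sets hold the same single device type.
consistent = report["model"] == report["tensors"] and len(report["model"]) == 1
print("consistent:", consistent)
```

    If a call like report_devices(self.model, step_images, step_labels) inside train_epoch shows only one device, the mismatch is probably introduced later, inside the model's forward (line 49 of models/vgg.py in the traceback), and that is the place to look next.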