[SOLVED] How to use torch.Tensor.permute in torchvision model

I am using a ResNet50 classification model from torchvision, which by default accepts images as inputs. I want to make the model accept NumPy files (.npy) as inputs. I understand the two use different dimension orders, since the NumPy data is laid out as

[batch_size, depth, height, width, channels]

instead of

[batch_size, channels, depth, height, width].

Based on this answer, I can use the permute function to change the order of the dimensions. However, I can't find any solution or leads on how to do this in a torchvision model.

Solution

Let's say you have a tensor x with the dimensions

[batch_size, depth, height, width, channels]

and you want to get a tensor y with dimensions

[batch_size, channels, depth, height, width]

The permute() method reorders these dimensions. You pass it the indices of the original dimensions, listed in the order in which they should appear in the new tensor, that is

y = x.permute([0, 4, 1, 2, 3])

To see why, note that in the original tensor x the dimensions were enumerated as

[batch_size, depth, height, width, channels]
 0           1      2       3      4

In the new tensor we therefore get

[batch_size, channels, depth, height, width]
 0           4         1      2       3

which is exactly the index order we need to pass to permute().
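As a minimal sketch of the steps above, the snippet below builds a small random array in place of a real .npy file (the array shape and the variable names are assumptions made for illustration), converts it to a tensor, and checks that permute() produces the expected layout.

```python
import numpy as np
import torch

# Hypothetical stand-in for np.load("your_file.npy"): a random array laid out as
# [batch_size, depth, height, width, channels].
array = np.random.rand(2, 4, 8, 8, 3).astype(np.float32)

x = torch.from_numpy(array)    # shape: [2, 4, 8, 8, 3]
y = x.permute(0, 4, 1, 2, 3)   # shape: [2, 3, 4, 8, 8]

print(x.shape, y.shape)

# permute() only changes how the data is indexed:
# element [b, d, h, w, c] of x is element [b, c, d, h, w] of y.
assert y[1, 2, 3, 4, 5] == x[1, 3, 4, 5, 2]

# permute() returns a non-contiguous view; .contiguous() copies the data
# into the new layout for operations that require it (e.g. .view()).
y = y.contiguous()
```

The call to .contiguous() is optional here; it only matters if a later operation requires contiguous memory.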

Alternatively, you can use torch.einsum(), where each dimension gets a one-letter label and you simply write the labels in their old and new order, which many find more intuitive.

y = torch.einsum('b d h w c -> b c d h w', x)
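To check that the two approaches agree, here is a small sketch that compares them on a random tensor. The tensor shape is an assumption chosen for illustration, and the einsum signature is written in the compact form without spaces.

```python
import torch

# Random tensor laid out as [batch_size, depth, height, width, channels].
x = torch.randn(2, 4, 8, 8, 3)

# Reorder with permute(): indices of the old dimensions in their new order.
y_permute = x.permute(0, 4, 1, 2, 3)

# Reorder with einsum(): one letter per dimension, old order -> new order.
y_einsum = torch.einsum("bdhwc->bcdhw", x)

print(y_permute.shape, y_einsum.shape)

# Both results have the same shape and the same values.
assert torch.equal(y_permute, y_einsum)
```

Both calls only rearrange dimensions, so which one to use is mostly a matter of readability.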