I am currently implementing the LoFTR model and came across the following code:
feature_c0.shape
-> torch.Size([1, 256, 60, 60])
rearrange(feature_c0, 'n c h w -> n (h w) c').shape
-> torch.Size([1, 3600, 256])
feature_c0.view(1, -1, 256).shape
-> torch.Size([1, 3600, 256])
I thought I understood the functionality of both tensor.view and rearrange. The problem: the output of these two calls is different, even though their shapes are the same. I don't really understand what is going on here.
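For example, with a small stand-in tensor (my own minimal check, not LoFTR code) the element order clearly differs:

import torch
from einops import rearrange

t = torch.arange(8).view(1, 2, 2, 2)  # n=1, c=2, h=2, w=2

rearrange(t, 'n c h w -> n (h w) c')
-> tensor([[[0, 4],
            [1, 5],
            [2, 6],
            [3, 7]]])

t.view(1, -1, 2)
-> tensor([[[0, 1],
            [2, 3],
            [4, 5],
            [6, 7]]])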
torch.Tensor.view reshapes a tensor by reading its elements in their existing memory order; a -1 tells PyTorch to infer that dimension from the total number of elements.
For example,
import torch

x = torch.arange(24)
x = x.view(1, 2, 3, 4)
>
tensor([[[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]]])
x_res = x.view(1, -1, 6) # x_res.shape = [1, 4, 6]
>
tensor([[[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]]])
from einops import rearrange

x_res = rearrange(x, 'a b c d -> a e f') # raises an error: e and f are never defined on the input side
With tensor.view() it is still possible to reshape to last_dimension=6 simply by consuming elements in memory order, while rearrange() requires every dimension that is reshaped, split, or grouped to be named explicitly in the pattern.
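By contrast, a pattern whose output axes are all named on the input side works, because the grouping is explicit (my own illustration, same x as above):

x_res = rearrange(x, 'a b c d -> a (b c) d') # OK: x_res.shape = [1, 6, 4]
x_res = rearrange(x, 'a b c d -> a b (c d)') # OK: x_res.shape = [1, 2, 12]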
In your case, feature_c0.view(1, -1, 256) just slices the flattened c * h * w = 256 * 60 * 60 memory into rows of 256 consecutive elements, which groups spatial values from within a channel rather than producing the (60*60) per-pixel 256-dimensional feature vectors you wanted; rearrange transposes the channel axis to the end before flattening, so each row really is one pixel's feature vector.
As a result, rearrange, which makes the axis reordering explicit, is the right function in your case.
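To make the difference concrete (a sketch, assuming the shapes from your question): rearrange here is equivalent to a permute followed by a reshape, while the plain view skips the permute.

import torch
from einops import rearrange

feature_c0 = torch.randn(1, 256, 60, 60)

a = rearrange(feature_c0, 'n c h w -> n (h w) c')
b = feature_c0.permute(0, 2, 3, 1).reshape(1, -1, 256)  # move channels last, then flatten
c = feature_c0.view(1, -1, 256)                         # flatten raw memory, no transpose

torch.equal(a, b)
-> True
torch.equal(a, c)
-> False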