I'm trying to implement a pattern recognition model using a fully convolutional network (fig 1 in https://www.sciencedirect.com/science/article/pii/S0031320318304370, I was able to get the full text without signing in or anything but if it's a problem I can attach a picture too!) but I'm getting a size error when moving from the final Conv2D layer to the first fc_layer.
Here is my error message:
RuntimeError: size mismatch, m1: [4 x 1024], m2: [4 x 1024] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:283
Originally, as in the figure, my first linear layer was:
nn.Linear(4*4*512, 1024)
but after getting the size mismatch, I changed it to:
nn.Linear(4,1024)
Now, I have a strange error message as written above.
For reference (if it helps), here is my code:
import torch.nn as nn
import torch.utils.model_zoo as model_zoo
class convnet(nn.Module):
def __init__(self, num_classes=1000):
super(convnet, self).__init__()
self.features = nn.Sequential(
nn.Conv2d(1, 64, kernel_size=3, stride=2, padding=1),
nn.ReLU(inplace=True),
nn.Conv2d(64, 64, kernel_size=3, padding=1),
nn.ReLU(inplace=True),
nn.Conv2d(64, 64, kernel_size=3, padding=1),
nn.MaxPool2d(kernel_size=1),
nn.Conv2d(64, 128, kernel_size=3, padding=1),
nn.ReLU(inplace=True),
nn.Conv2d(128, 128, kernel_size=3, padding=1),
nn.ReLU(inplace=True),
nn.MaxPool2d(kernel_size=2),# stride=2),
nn.Conv2d(128, 256, kernel_size=3, padding=1),
nn.ReLU(inplace=True),
nn.Conv2d(256, 256, kernel_size=3, padding=1),
nn.ReLU(inplace=True),
nn.MaxPool2d(kernel_size=2), #stride=2),
nn.Conv2d(256, 512, kernel_size=3, padding=1),
nn.ReLU(inplace=True),
nn.Conv2d(512, 512, kernel_size=3, padding=1),
nn.ReLU(inplace=True), #nn.Dropout(p=0.5)
)
self.classifier = nn.Sequential(
nn.Linear(4, 1024),
nn.Dropout(p=0.5),
nn.ReLU(inplace=True),
#nn.Dropout(p=0.5),
nn.Linear(1024, 1024),
nn.ReLU(inplace=True),
nn.Linear(1024, num_classes),
)
def forward(self, x):
x = self.features(x)
x = torch.flatten(x,1)
x = self.classifier(x)
return x
I suspect it's an issue with the padding and stride. Thanks!
The error is from a matrix multiplication, where m1
should be an m x n matrix and m2
an n x p matrix and the result would be an m x p matrix. In your case it's 4 x 1024 and 4 x 1024, but that doesn't work since 1024 != 4
.
That means your input to the first linear layer has size [4, 1024] (4 being the batch size), therefore the input features of the first linear layer should be 1024.
self.classifier = nn.Sequential(
nn.Linear(1024, 1024),
nn.Dropout(p=0.5),
nn.ReLU(inplace=True),
#nn.Dropout(p=0.5),
nn.Linear(1024, 1024),
nn.ReLU(inplace=True),
nn.Linear(1024, num_classes),
)
If you are uncertain how many features your input has, you can print out its size just before the layer:
x = self.features(x)
x = torch.flatten(x,1)
print(x.size()) # => torch.Size([4, 1024])
x = self.classifier(x)