I am building a CNN in PyTorch. Below is the code I would use for grayscale input images:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Shapes are (batch, channels, height, width)
        # 1x1x28x28 to 1x32x28x28
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, stride=1, padding=1)
        # 1x32x28x28 to 1x64x28x28
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=1)
        # 1x64x28x28 to 1x64x14x14
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        # 1x64x14x14 to 1x128x14x14
        self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1, padding=1)
        # 1x128x14x14 to 1x128x7x7
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        # flattened 1x(128*7*7) to 1x128
        self.fc1 = nn.Linear(in_features=128*7*7, out_features=128)
        # 1x128 to 1x27 (no. of classes)
        self.fc2 = nn.Linear(in_features=128, out_features=27)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = self.pool1(x)
        x = F.relu(self.conv3(x))
        x = self.pool2(x)
        # Flatten the feature maps before the fully connected layers
        x = x.view(-1, 128*7*7)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return x
Do I have to change any of this code to adapt it to RGB images? If so, what are these changes?
FYI:
Thank you so much.
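The Conv2d layers expect a batched 4-D input tensor of shape (batch, channels, height, width), such as your 1x1x28x28. If the problem is otherwise identical (same classification task, same number of classes, same image size), the only thing that changes is the number of input channels, from 1 to 3. The input shape therefore becomes 1x3x28x28, and conv1 must be created with in_channels=3 instead of in_channels=1. The rest of the network stays the same, because every later layer only depends on the 32 channels that conv1 outputs.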
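For illustration, here is a minimal sketch of the same network with the channel count made configurable. The in_channels and num_classes constructor arguments are my own additions, not part of the question's code, and the random batch is only a stand-in for real RGB images. Everything else mirrors the original layers, including the final ReLU.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    # in_channels and num_classes are hypothetical parameters added for this sketch
    def __init__(self, in_channels=3, num_classes=27):
        super().__init__()
        # Only this layer depends on grayscale (1) vs. RGB (3) input
        self.conv1 = nn.Conv2d(in_channels, 32, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv3 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(128 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = self.pool1(x)
        x = F.relu(self.conv3(x))
        x = self.pool2(x)
        x = x.view(-1, 128 * 7 * 7)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))  # kept as in the question
        return x


# Stand-in batch of 4 RGB images, shape (batch, channels, height, width)
rgb_batch = torch.randn(4, 3, 28, 28)
out = Net(in_channels=3)(rgb_batch)
print(out.shape)  # torch.Size([4, 27])
```

Passing in_channels=1 gives back the grayscale version, so the same class can serve both cases as long as the images are 28x28.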
Also, you can check out these two articles from PyTorch: