My model is:
def forward(self, x):
    x = self.first_bn(x)
    x = self.selu(x)
    x0 = self.block0(x)
    y0 = self.avgpool(x0).view(x0.size(0), -1)
    y0 = self.fc_attention0(y0)
    y0 = self.sig(y0).view(y0.size(0), y0.size(1), -1)
    y0 = y0.unsqueeze(-1)
    x = x0 * y0 + y0
    x = nn.MaxPool2d(2)(x)
    x2 = self.block2(x)
    y2 = self.avgpool(x2).view(x2.size(0), -1)
    y2 = self.fc_attention2(y2)
    y2 = self.sig(y2).view(y2.size(0), y2.size(1), -1)
    y2 = y2.unsqueeze(-1)
    x = x2 * y2 + y2
    x = nn.MaxPool2d(2)(x)
    x4 = self.block4(x)
    y4 = self.avgpool(x4).view(x4.size(0), -1)
    y4 = self.fc_attention4(y4)
    y4 = self.sig(y4).view(y4.size(0), y4.size(1), -1)
    y4 = y4.unsqueeze(-1)
    x = x4 * y4 + y4
    x = nn.MaxPool2d(2)(x)
    x = self.bn_before_gru(x)
    x = self.selu(x)
    x = x.squeeze(-2)
    x = x.permute(0, 2, 1)
    x, _ = self.gru(x)
    x = x[:, -1, :]
    x = self.fc1_gru(x)
    x = self.fc2_gru(x)
    return x

def _make_attention_fc(self, in_features, l_out_features):
    l_fc = []
    l_fc.append(nn.Linear(in_features=in_features, out_features=l_out_features))
    return nn.Sequential(*l_fc)
When I run it, I get the following error:
RuntimeError: Given input size: (64x1x1). Calculated output size: (64x0x0). Output size is too small
How can I fix this?
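The primary issue is your input size: the error means that a pooling layer received a 1x1 feature map, which it cannot shrink any further.
If you examine the SpecRNet architecture, you'll notice that it contains several MaxPool2d layers, each of which halves the spatial dimensions.
Let's consider an example where we feed in a tensor of size (8, 1, 64, 64).
Here are the output shapes of each layer within SpecRNet: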
INPUT: torch.Size([8, 1, 64, 64])
first_bn(x): torch.Size([8, 1, 64, 64])
selu(x): torch.Size([8, 1, 64, 64])
block0(x): torch.Size([8, 20, 32, 32]) ######
avgpool(x0).view(x0.size(0), -1): torch.Size([8, 20])
fc_attention0(y0): torch.Size([8, 20])
sig(y0).view(y0.size(0), y0.size(1), -1): torch.Size([8, 20, 1])
unsqueeze(-1): torch.Size([8, 20, 1, 1])
x0 * y0 + y0: torch.Size([8, 20, 32, 32])
MaxPool2d(2)(x): torch.Size([8, 20, 16, 16]) ######
block2(x): torch.Size([8, 64, 8, 8]) ######
avgpool(x2).view(x2.size(0), -1): torch.Size([8, 64])
fc_attention2(y2): torch.Size([8, 64])
sig(y2).view(y2.size(0), y2.size(1), -1): torch.Size([8, 64, 1])
unsqueeze(-1): torch.Size([8, 64, 1, 1])
x2 * y2 + y2: torch.Size([8, 64, 8, 8])
MaxPool2d(2)(x): torch.Size([8, 64, 4, 4]) ######
block4(x): torch.Size([8, 64, 2, 2]) ######
avgpool(x4).view(x4.size(0), -1): torch.Size([8, 64])
fc_attention4(y4): torch.Size([8, 64])
sig(y4).view(y4.size(0), y4.size(1), -1): torch.Size([8, 64, 1])
unsqueeze(-1): torch.Size([8, 64, 1, 1])
x4 * y4 + y4: torch.Size([8, 64, 2, 2])
MaxPool2d(2)(x): torch.Size([8, 64, 1, 1]) ######
bn_before_gru(x): torch.Size([8, 64, 1, 1])
selu(x): torch.Size([8, 64, 1, 1])
squeeze(-2): torch.Size([8, 64, 1])
permute(0, 2, 1): torch.Size([8, 1, 64])
gru(x): torch.Size([8, 1, 128])
x[:, -1, :]: torch.Size([8, 128])
fc1_gru(x): torch.Size([8, 128])
fc2_gru(x): torch.Size([8, 1])
OUTPUT: torch.Size([8, 1])
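We observe that each spatial dimension is halved after passing through block0, block2, and block4, and again after each MaxPool2d.
Since SpecRNet uses block0, block2, and block4 and applies MaxPool2d 3 times, the input is halved 6 times in total, so each spatial dimension should be at least 2^6 = 64. With a smaller input, the feature map shrinks to 1x1 before the last pooling layer, which is what the error message reports.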
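To make the arithmetic concrete, here is a minimal sketch in plain Python. It assumes every one of the six stages halves the size with floor division, as MaxPool2d(2) does with its default settings; the function name trace_spatial_size is made up for this illustration and is not part of SpecRNet.

```python
def trace_spatial_size(size, n_halvings=6):
    """Return the spatial size after each halving stage.

    Assumption: every stage (block0, block2, block4 and the three
    MaxPool2d(2) calls) divides the size by 2 with floor division.
    Tracing stops as soon as a stage would produce size 0.
    """
    sizes = [size]
    for _ in range(n_halvings):
        size = size // 2
        sizes.append(size)
        if size == 0:
            # A real pooling layer would raise "Output size is too small" here.
            break
    return sizes


print(trace_spatial_size(64))  # [64, 32, 16, 8, 4, 2, 1]
print(trace_spatial_size(32))  # [32, 16, 8, 4, 2, 1, 0]
```

Under these assumptions, a 32x32 input reaches 1x1 right before the last pooling step, which would then have to produce a 0x0 output.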
In addition, your configuration fixes the number of input channels:
from typing import Dict

def get_specrnet_config(input_channels: int) -> Dict:
    return {
        "filts": [input_channels, [input_channels, 20], [20, 64], [64, 64]],
        "nb_fc_node": 64,
        "gru_node": 64,
        "nb_gru_layer": 2,
        "nb_classes": 1,
    }

specrnet_config = get_specrnet_config(input_channels=1)
This means that the model expects 1 input channel.
In summary, your input should have the shape (batch_size, 1, 64, 64).
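If you want to catch a wrong shape before it reaches the model, a small guard along these lines may help. This is only a sketch: check_input_shape is a hypothetical helper, and the defaults (1 channel, minimum size 64) are taken from the reasoning above rather than from the SpecRNet code itself.

```python
def check_input_shape(shape, in_channels=1, min_size=64):
    """Validate a (batch, channels, height, width) shape tuple.

    Hypothetical helper: the defaults assume 1 input channel and
    six halvings (2**6 = 64) between the input and the GRU.
    Raises ValueError with a readable message if the shape is unsuitable.
    """
    if len(shape) != 4:
        raise ValueError(f"expected 4 dimensions (N, C, H, W), got {len(shape)}")
    _, channels, height, width = shape
    if channels != in_channels:
        raise ValueError(f"expected {in_channels} channel(s), got {channels}")
    if height < min_size or width < min_size:
        raise ValueError(
            f"height and width must be at least {min_size}, got {height}x{width}"
        )


check_input_shape((8, 1, 64, 64))  # passes silently
```

With a PyTorch tensor x, you could call check_input_shape(tuple(x.shape)) right before model(x).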