I am a newbie in ML. I am trying to implement a model from the article "Swimming Style Recognition and Lap Counting Using a Smartwatch and Deep Learning" (https://doi.org/10.1145/3341163.3347719). The model's input is windowed data with 11 channels and a window size of 180. After the first conv layer and max pooling, however, the tensor still has 11 channels with a window size of 59, plus an additional dimension of 64 feature maps. But the authors used only conv1d with a kernel size of 3x1.
I have failed to implement such a kernel using nn.Conv1d. How can I do that?
A convolution with kernel size 3x1 is not a 1D conv, it's a 2D conv:
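import torch
from torch import nn
conv = nn.Conv2d(1, 64, (3, 1))  # 1 input channel, 64 feature maps, kernel spans 3 time steps and 1 sensor channel
maxpool = nn.MaxPool2d((3, 1))   # pools over time only; stride defaults to the kernel size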
Look at inference on a single-channel 180x11 input:
>>> maxpool(conv(torch.rand(1,1,180,11))).shape
torch.Size([1, 64, 59, 11])
This matches the shape of the "Conv. layer 1" output shown in the figure above.
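If you do want to stay with nn.Conv1d, here is a minimal sketch of one way to get the same result. It rests on the assumption that the 3x1 kernel is shared across all 11 sensor channels, which is what the Conv2d above does; whether the authors implemented it this way is not something I can confirm from the paper. The idea is to fold the 11 sensor channels into the batch dimension, run a Conv1d over time, and unfold again. The batch size of 2 and the variable names are just for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

conv2d = nn.Conv2d(1, 64, (3, 1))
conv1d = nn.Conv1d(1, 64, 3)

# Copy the Conv2d parameters into the Conv1d so both layers compute the same thing.
# Conv2d weight is (64, 1, 3, 1); Conv1d weight is (64, 1, 3).
with torch.no_grad():
    conv1d.weight.copy_(conv2d.weight.squeeze(-1))
    conv1d.bias.copy_(conv2d.bias)

x = torch.rand(2, 1, 180, 11)  # (batch, 1, time, sensor channels)
n, _, t, c = x.shape

# Fold the sensor channels into the batch: (N, 1, 180, 11) -> (N*11, 1, 180)
x1d = x.permute(0, 3, 1, 2).reshape(n * c, 1, t)
y1d = conv1d(x1d)  # (N*11, 64, 178)

# Unfold back to the Conv2d layout: (N, 64, 178, 11)
y1d = y1d.reshape(n, c, 64, -1).permute(0, 2, 3, 1)

y2d = conv2d(x)
same = torch.allclose(y1d, y2d, atol=1e-5)

# Max pooling over time only: floor((178 - 3) / 3) + 1 = 59
pooled_shape = tuple(nn.MaxPool2d((3, 1))(y2d).shape)
```

The time length works out as 180 - 3 + 1 = 178 after the convolution and floor((178 - 3) / 3) + 1 = 59 after pooling. In practice the Conv2d version is simpler, since it avoids the reshaping.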