So basically, these are the dimensions of the weights from the trained CaffeNet:
conv1: (96, 3, 11, 11) conv2: (256, 48, 5, 5) conv3: (384, 256, 3, 3) conv4: (384, 192, 3, 3) conv5: (256, 192, 3, 3)
I am confused: although conv1 outputs 96 channels, why does conv2 only consider 48 of them during convolution? Am I missing something?
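Yes, you missed the 'group' parameter. The convolution_param defined in the conv2 layer is given below. Note that group is set to 2: the 96 input channels and the 256 filters are each split into 2 groups, so every filter only convolves over 96 / 2 = 48 input channels, which is why the weight shape is (256, 48, 5, 5). Grouping the convolution layer this way reduces the number of weights and can save GPU memory.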
convolution_param {
  num_output: 256
  pad: 2
  kernel_size: 5
  group: 2
  weight_filler {
    type: "gaussian"
    std: 0.01
  }
  bias_filler {
    type: "constant"
    value: 1
  }
}
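
To make the arithmetic concrete, here is a minimal Python sketch of how the weight shape follows from the group setting. The helper name conv_weight_shape is hypothetical (it is not part of Caffe), and the group values for conv4 and conv5 are not stated above; they are inferred from the weight shapes in the question (384 / 192 = 2).

```python
def conv_weight_shape(num_output, in_channels, kernel_size, group=1):
    """Weight blob shape of a grouped convolution (hypothetical helper, not a Caffe API).

    Each filter only sees in_channels / group input channels, so the second
    dimension of the weight blob shrinks by a factor of `group`.
    """
    if in_channels % group != 0 or num_output % group != 0:
        raise ValueError("in_channels and num_output must be divisible by group")
    return (num_output, in_channels // group, kernel_size, kernel_size)


# (name, num_output, in_channels, kernel_size, group)
# group for conv4/conv5 is an assumption inferred from the shapes in the question.
layers = [
    ("conv1", 96, 3, 11, 1),
    ("conv2", 256, 96, 5, 2),
    ("conv3", 384, 256, 3, 1),
    ("conv4", 384, 384, 3, 2),
    ("conv5", 256, 384, 3, 2),
]

for name, num_output, in_channels, kernel_size, group in layers:
    print(name, conv_weight_shape(num_output, in_channels, kernel_size, group))
```

With group set to 1 for conv2, the same helper would give (256, 96, 5, 5), i.e. twice as many weights for that layer.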