I'm trying to convert a trained sign-language classification model from Python into C header files so that I can deploy it on a Cortex-M4 board. In Python, I can build and train the model, and it predicts with 90% accuracy. However, the number of weights generated in the convolution layers is not what I expect.
**Conv_1d configuration**
print(x_train.shape)
model = Sequential()
model.add(Conv1D(32,kernel_size=5, padding='same',
input_shape=x_train.shape[1:], name='conv1d_1'))
print(model.layers[0].kernel.numpy().shape)
**output:**
(1742, 45, 45)
**(5, 45, 32)**
According to the above configuration:
input dimension = 45x45x1 pixels (grayscale image)
input channels = 1
output dimension = 45x45x32
output channels = 32
kernel size = 5
According to the theory (see https://cs231n.github.io/convolutional-networks/):
number of weights = (input_channels) x (kernel_size) x (kernel_size) x (output_channels) = 1 x 5 x 5 x 32 = 800
But the Keras model produces a weight array of size [5][45][32] = 7200.
I'm not sure whether my interpretation of the weight array in the Keras model is correct. I would be glad if someone could help me with this.
A few points that should clarify your doubts.
Your formula for the number of weights can't be right because you're using a `Conv1D`, so the kernel size has only one dimension.
Defining the input shape as `x_train.shape[1:] = (45, 45)` tells `Conv1D` that each sample is a sequence of 45 steps with 45 channels per step (again, because it's a `Conv1D`), so the layer sees 45 input channels rather than 1.
That said, the number of weights is:
# of weights = input_channels x kernel_size x output_filters = 45 x 5 x 32 = 7200 (without biases)
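As a quick sanity check, the arithmetic above can be reproduced without Keras. This is only a sketch: `conv1d_kernel_shape` and `conv1d_weight_count` are hypothetical helper names, and they assume the Keras `Conv1D` kernel layout `(kernel_size, in_channels, filters)` with one bias per filter.

```python
from math import prod

def conv1d_kernel_shape(kernel_size, in_channels, filters):
    """Kernel shape of a Keras-style Conv1D layer (hypothetical helper)."""
    return (kernel_size, in_channels, filters)

def conv1d_weight_count(kernel_size, in_channels, filters, use_bias=False):
    """Number of trainable values: kernel entries plus one bias per filter."""
    n = prod(conv1d_kernel_shape(kernel_size, in_channels, filters))
    return n + (filters if use_bias else 0)

# Values from the question: kernel_size=5, input shape (45, 45), 32 filters.
shape = conv1d_kernel_shape(5, 45, 32)
no_bias = conv1d_weight_count(5, 45, 32)
with_bias = conv1d_weight_count(5, 45, 32, use_bias=True)
print(shape, no_bias, with_bias)
```

The middle dimension of the kernel is the number of input channels, which is why the 45 from the input shape shows up in the weight array.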
Since you have images, you are probably looking for `Conv2D`. In that case, the input shape should be `(45, 45, 1)`, the kernel has two dimensions, and the number of parameters is exactly 800 (without biases):
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv2D(32,kernel_size=5, padding='same',
input_shape=(45, 45, 1), use_bias=False))
model.summary()
# Layer (type) Output Shape Param #
# conv (Conv2D) (None, 45, 45, 32) 800
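Since the goal is to export the weights to C headers, it may also help to see how a multi-dimensional kernel index maps to a position in a flat array. The sketch below assumes the kernel is flattened in row-major (C) order, which is what NumPy's `flatten()` does by default, and that the kernel uses the Keras layouts `(kernel_size, in_channels, filters)` for `Conv1D` and `(kernel_h, kernel_w, in_channels, filters)` for `Conv2D`. `flat_index` is a hypothetical helper, not part of any library.

```python
def flat_index(index, shape):
    """Row-major (C-order) offset of a multi-dimensional index."""
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same length")
    offset = 0
    for i, dim in zip(index, shape):
        if not 0 <= i < dim:
            raise IndexError(f"index {i} out of range for dimension {dim}")
        # Each step moves one dimension deeper into the flattened array.
        offset = offset * dim + i
    return offset

conv1d_shape = (5, 45, 32)      # kernel from the question's Conv1D layer
conv2d_shape = (5, 5, 1, 32)    # kernel from the Conv2D layer above

# The last element of each kernel lands at (number of weights - 1).
last_conv1d = flat_index((4, 44, 31), conv1d_shape)
last_conv2d = flat_index((4, 4, 0, 31), conv2d_shape)
print(last_conv1d, last_conv2d)
```

With this ordering, the filter index varies fastest, so a C array declared as `w[5][45][32]` (or `w[5][5][1][32]`) would be read in the same order as the flattened kernel.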