Tags: python, tensorflow, keras, conv-neural-network, stride

How do strides affect input shapes in Keras?


I'm building a simple image classifier in Keras and use MaxPooling2D to reduce the image size. I recently learned about strides and wanted to use them, but I ran into errors. Here's the code that fails:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.callbacks import EarlyStopping

early_stopping = EarlyStopping(monitor='val_loss', min_delta=0.001, patience=20, restore_best_weights=True)
model = tf.keras.Sequential()

model.add(tf.keras.layers.Conv2D(512, (2, 2),input_shape=(X[0].shape), strides = 2, data_format='channels_first', activation = 'relu'))

model.add(tf.keras.layers.MaxPooling2D(pool_size=(3, 3)))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Conv2D(512, (3, 3), data_format='channels_first',activation = 'relu'))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(3, 3)))
model.add(tf.keras.layers.Dropout(0.5))

model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Conv2D(128, (3, 3), data_format='channels_first',activation = 'relu'))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(4, 4)))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Flatten())  

model.add(tf.keras.layers.Dense(128))

model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

opt = keras.optimizers.Adam(learning_rate=0.0005)
model.compile(loss='binary_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])

h= model.fit(trainx, trainy, validation_data = (valx, valy), batch_size=64, epochs=80, callbacks = [early_stopping], verbose = 0)

Here's the error:

ValueError: Negative dimension size caused by subtracting 4 from 2 for '{{node max_pooling2d_35/MaxPool}} = MaxPool[T=DT_FLOAT, data_format="NHWC", explicit_paddings=[], ksize=[1, 4, 4, 1], padding="VALID", strides=[1, 4, 4, 1]](Placeholder)' with input shapes: [?,128,2,46].

When I remove `strides = 2`, everything works just fine. Why does the strides option cause an input-shape error, and how can I prevent it? I couldn't find any info about this.


Solution

  • Stride is how far the kernel shifts at each step. A stride of 2 roughly halves the spatial dimensions of the input along each axis. Because of your convolutions, pooling, and the added stride, the feature map reaching the last pooling layer has spatial size 128 by 2 (see the `[?,128,2,46]` in the error). A 4 x 4 pooling window cannot fit on an axis whose size is only 2, hence the negative dimension.

    You can use padding here, which pads the data with zeros, to bring the dimension up so the 4 x 4 window fits and avoid the error.
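A minimal sketch of both points, using hypothetical toy tensors rather than the asker's data: the first part shows a stride of 2 halving the spatial dimensions, and the second shows that `padding='same'` lets a 4 x 4 pooling window work on the `[?, 128, 2, 46]` shape from the error, where `padding='valid'` (the default) would fail.

```python
import tensorflow as tf

# A stride of 2 roughly halves each spatial dimension.
x = tf.zeros((1, 100, 100, 3))  # hypothetical batch of one 100x100 RGB image
y = tf.keras.layers.Conv2D(8, (2, 2), strides=2)(x)
print(y.shape)  # (1, 50, 50, 8)

# The shape from the error message: only 2 pixels along one spatial axis.
small = tf.zeros((1, 128, 2, 46))

# The default padding='valid' cannot fit a 4x4 window into a size of 2,
# which is exactly the "negative dimension" error from the question.
# With padding='same', Keras zero-pads the input so the window fits:
z = tf.keras.layers.MaxPooling2D(pool_size=(4, 4), padding='same')(small)
print(z.shape)  # (1, 32, 1, 46)
```

With `padding='same'`, each spatial output dimension is `ceil(input / stride)`, so the layer never produces a negative size regardless of how small the feature map has become.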