Tags: keras, conv-neural-network, large-language-model, image-classification

ValueError: Input 0 of layer "sequential_7" is incompatible with the layer: expected shape=(None, 224, 224, 1), found shape=(None, 244, 1)


Issue: The model is expected to take a single image and return a prediction (a probability). However, when calling model.predict, the following error is thrown:

"ValueError: Input 0 of layer "sequential_7" is incompatible with the layer: expected shape=(None, 224, 224, 1), found shape=(None, 244, 1)"

Below are the code snippets, in order:

  1. Data Pipeline (reading csv, resizing images)
  2. Create and train model
  3. Prediction using a single image
# 1. DATA PIPELINE

CLASS_NAMES = ['loose', 'control']

def decode_csv(csv_row): # csv_row consists of a file path and the image class
    
    record_defaults = ["path", "image class"] # default values (and dtypes) for the two CSV columns
    filename, label_string = tf.io.decode_csv(csv_row, record_defaults) # parses a single CSV row into its fields
    
    image_bytes = tf.io.read_file(filename=filename) # output: raw encoded image bytes
    image_bytes = tf.image.decode_jpeg(image_bytes) # output: a uint8 integer tensor
    image_bytes = tf.image.convert_image_dtype(image_bytes, tf.float32) # output: floats in the 0 - 1 range
    image_bytes = tf.image.resize(image_bytes, [224, 224]) # output: a 224 x 224 image
    
    label = tf.math.equal(CLASS_NAMES, label_string) # one-hot boolean array, True at the position of the image's class
    
    return image_bytes, label # a float32 image tensor and its one-hot boolean label

def load_dataset(csv_file, batch_size, training=True):
    ds = tf.data.TextLineDataset(filenames=csv_file).skip(1) # skip(1) will remove the top row i.e. header
    ds = ds.map(decode_csv).cache()
    ds = ds.batch(batch_size=batch_size)
    
    if training:
        ds = ds.shuffle(10).repeat()
    return ds

train_ds = load_dataset("gs://qwiklabs-asl-04-06351f77b64f-hip-implant/hip-implant-data.csv", batch_size = 10)

validation_data = load_dataset("gs://qwiklabs-asl-04-06351f77b64f-hip-implant/hip-implant-data.csv", batch_size = 10, training=False)
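
A quick way to confirm what the pipeline actually produces (a sketch, assuming the CSV and images on GCS are readable from this environment) is to pull one batch and print its shapes:

for image_batch, label_batch in train_ds.take(1):
    print(image_batch.shape)  # e.g. (10, 224, 224, 1) for 1-channel images
    print(label_batch.shape)  # e.g. (10, 2), one boolean per class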

# 2. CREATE MODEL

IMG_HEIGHT = 224
IMG_WIDTH = 224
IMG_CHANNELS = 64

model = Sequential([
    Conv2D(name="first-Conv2D-layer",filters=64, kernel_size=3, input_shape=(IMG_WIDTH, IMG_HEIGHT, 1), padding='same', activation='relu'),
    MaxPooling2D(name="first-pooling-layer",strides=2, padding='same'),
    Conv2D(name="second-Conv2D-layer", filters=32, kernel_size=3, activation='relu'),
    MaxPooling2D(name="second-pooling-layer", strides=2, padding='same'),
    Flatten(),
    Dense(units=400, activation='relu'),
    Dense(units=100, activation='relu'),
    Dropout(0.25),
    Dense(2),
    Softmax()
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

Model Summary

(model summary screenshot omitted)
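
The same information can be reproduced in code, which also shows the input shape the network expects:

model.summary()           # per-layer output shapes, as in the screenshot above
print(model.input_shape)  # (None, 224, 224, 1) - a rank-4 batch is expected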

# 3. PREDICTION USING A SINGLE IMAGE:

img_bytes = tf.io.read_file("gs://qwiklabs-asl-04-06351f77b64f-hip-implant/Control/control (25).png") # raw encoded image bytes, not a path

new_image = decode_img(img_bytes, [244, 244])

print(new_image.shape)
plt.imshow(new_image.numpy())

prediction = model.predict(new_image)
print(prediction)

Resolutions already tried:

  1. Tried keeping padding='same' in the convolutional layers (in response to an initial error about a dimension mismatch inside a convolutional layer).
  2. Tried explicitly specifying the model's input shape as (244, 244, 1) by adding an Input(shape=(244,244,1)) layer.
  3. Tried changing the filter sizes / units / pool sizes (in response to another error stating that a layer could not reduce the dimensionality further).

Edit 1: I missed including the decode_img function, which resizes the test image (the single image we are trying to predict with):

img = tf.io.read_file("gs://qwiklabs-asl-04-06351f77b64f-hip-implant/Control/control (25).png")

def decode_img(img, reshape_dims):
    img = tf.image.decode_jpeg(img) # decode the raw image bytes into a uint8 integer tensor
    img = tf.image.convert_image_dtype(img, tf.float32) # cast the integer tensor into floats in the 0 - 1 range
    img = tf.image.resize(img, reshape_dims) # resize so image dimensions are consistent for the network
    return img


img = decode_img(img, [224, 224])

plt.imshow(img.numpy())
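
For reference, printing the decoded image's shape next to the model's expected input makes the mismatch visible (a sketch, assuming the model defined above is in scope):

print(img.shape)          # e.g. (224, 224, 1) - a single rank-3 image, no batch dimension
print(model.input_shape)  # (None, 224, 224, 1) - the batch dimension comes first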

Solution

  • Credit to the user who provided the resolution: https://www.reddit.com/r/learnmachinelearning/comments/18iaf0b/comment/kdbxjqw/?context=3

    "This error is most common when the data you're feeding to a network is the wrong shape. If each sample is size (224, 224, 1), then your input must be rank 4, with shape (n_batch, 224, 224, 1).

    If you are trying to test the model with one image, you might accidentally feed it a tensor with shape (224, 224, 1). This is wrong, the correct way to feed in one image is with a shape (1, 224, 224, 1).

    You can use numpy.stack, which given several (224, 224, 1) arrays, can combine them alone axis = 0 to a (n_batch, 224, 224, 1) shape."

    Here, since the goal was to feed a single image and get a prediction, I wrapped that one image in a batch using numpy.stack((image,), axis=0) and passed the result to model.predict. A minimal sketch of the corrected prediction code is shown below.
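
Minimal sketch of the working prediction path, assuming the model and the decode_img helper defined above are in scope (note the resize uses 224, matching the training pipeline, rather than 244):

import numpy as np

img_bytes = tf.io.read_file("gs://qwiklabs-asl-04-06351f77b64f-hip-implant/Control/control (25).png")
single_image = decode_img(img_bytes, [224, 224])          # rank 3: (224, 224, 1)
image_batch = np.stack((single_image.numpy(),), axis=0)   # rank 4: (1, 224, 224, 1)
# tf.expand_dims(single_image, axis=0) achieves the same thing.

prediction = model.predict(image_batch)
print(prediction)  # one row of class probabilities, ordered as in CLASS_NAMES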