tensorflow, keras, deep-learning, tf.keras

How can I correct an input_shape error produced when following the "Overfit and underfit" TensorFlow tutorial?


I am working through the "Overfit and underfit" TensorFlow tutorial and have reproduced all steps up to the tiny_model using the code they supply. However, when I run the command

size_histories['Tiny'] = compile_and_fit(tiny_model, 'sizes/Tiny')

I get the error Invalid input shape for input Tensor("data:0", shape=(28,), dtype=float32). Expected shape (None, 28), but input has incompatible shape (28,). I do not understand why the "expected shape" is (None, 28) when the tutorial and I specify input_shape=(FEATURES,) in the definition of the tiny_model. (We earlier defined FEATURES = 28.)

What am I missing here? My understanding of model construction in TensorFlow is still shaky, so the answer could be simple.


Solution

  • Going through the tutorial, it seems you may have forgotten to include the lines

    validate_ds = validate_ds.batch(BATCH_SIZE)
    train_ds = train_ds.shuffle(BUFFER_SIZE).repeat().batch(BATCH_SIZE)
    

    which appear just before this part (scroll up a bit). The important part for your error message is the .batch() call, which divides the training (and validation) dataset into batches.

    The confusing bit at the beginning is that input_shape=(FEATURES,) sets the shape of a single sample for the network (the batch size is ignored at this step), but model.fit(...) expects batches of data. That is where the (None, 28) shape comes from. It is (None, features) rather than (batch_size, features) because TensorFlow treats the batch size as variable (None stands for a variable size in this context). This mainly matters for the last batch, which is usually smaller than the full batch size.
    For example, with 40 samples and a batch size of 16, you get batches of 16, 16, and 8 data points, because there is not enough data left to fill the last batch to 16.
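    You can see this directly with tf.data. A minimal sketch (using a made-up toy dataset of zeros, just to inspect the batch shapes):

    ```python
    import tensorflow as tf

    # Hypothetical toy dataset: 40 samples with 28 features each.
    ds = tf.data.Dataset.from_tensor_slices(tf.zeros((40, 28)))
    batched = ds.batch(16)

    # The last batch is smaller, which is why Keras reports the batch
    # dimension as None (variable) rather than a fixed 16.
    print([int(b.shape[0]) for b in batched])  # [16, 16, 8]
    ```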

    The error you got indicates unbatched data, so the network receives one sample at a time. But a Keras model always expects a batch dimension, even if the batch size is just 1 for a single sample.
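    If you ever do need to feed a single sample, you can add that batch dimension of 1 yourself; a minimal sketch:

    ```python
    import tensorflow as tf

    sample = tf.zeros((28,))             # one sample, shape (28,)
    batched = tf.expand_dims(sample, 0)  # shape (1, 28): a batch of one
    # Equivalently: sample[None, ...] or tf.reshape(sample, (1, 28))
    print(batched.shape)  # (1, 28)
    ```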

    Edit: To solve the error ValueError: Arguments target and output must have the same rank (ndim): The problem here is that the targets have to have shape (batch_size, labels). If each target is a single value (as in this case), the array is often flattened, so the second labels dimension is missing. To fix this, use one of the following two solutions:
    If your data is in NumPy arrays, TensorFlow tensors, etc., you can use

    labels = labels.reshape((-1, 1))  # or similar functions
    
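    For example, with a made-up array of four scalar labels:

    ```python
    import numpy as np

    labels = np.array([0, 1, 1, 0])   # shape (4,): one value per sample
    labels = labels.reshape((-1, 1))  # shape (4, 1): adds the labels dim
    print(labels.shape)  # (4, 1)
    ```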

    If you already have a Dataset, use

    ds = ds.map(lambda x, y: (x, tf.expand_dims(y, 0)))
    

    This maps the dimension expansion to every element of the dataset.
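    Putting the Dataset case together, a minimal sketch with made-up toy data (four samples with 28 features each and one scalar label per sample):

    ```python
    import tensorflow as tf

    # Hypothetical dataset of (features, scalar label) pairs.
    features = tf.zeros((4, 28))
    labels = tf.constant([0., 1., 1., 0.])
    ds = tf.data.Dataset.from_tensor_slices((features, labels))

    # Expand each scalar label to shape (1,), so batching later
    # produces targets of shape (batch_size, 1).
    ds = ds.map(lambda x, y: (x, tf.expand_dims(y, 0)))
    x, y = next(iter(ds))
    print(x.shape, y.shape)  # (28,) (1,)
    ```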