Tags: python, tensorflow, keras, tensorflow-datasets, tensorflow-estimator

TensorFlow: logits and labels must have the same first dimension


I am new to TensorFlow and I want to adapt the MNIST tutorial https://www.tensorflow.org/tutorials/layers to my own data (40x40 images). This is my model function:

def cnn_model_fn(features, labels, mode):
        # Input Layer
        input_layer = tf.reshape(features, [-1, 40, 40, 1])

        # Convolutional Layer #1
        conv1 = tf.layers.conv2d(
                inputs=input_layer,
                filters=32,
                kernel_size=[5, 5],
                # padding can be "same" or "valid"; "same" makes the output
                # tensor keep the same width and height as the input tensor
                padding="same",
                activation=tf.nn.relu)

        # Pooling Layer #1
        pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

        # Convolutional Layer #2 and Pooling Layer #2
        conv2 = tf.layers.conv2d(
                inputs=pool1,
                filters=64,
                kernel_size=[5, 5],
                padding="same",
                activation=tf.nn.relu)
        pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

        # Dense Layer
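        # Two 2x2 max-pools with stride 2 halve the 40x40 input twice
        # (40 -> 20 -> 10), and conv2 has 64 filters, so pool2 has shape
        # [batch_size, 10, 10, 64]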
        pool2_flat = tf.reshape(pool2, [-1, 10 * 10 * 64])
        dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
        dropout = tf.layers.dropout(
                inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)

        # Logits Layer
        logits = tf.layers.dense(inputs=dropout, units=2)

        predictions = {
            # Generate predictions (for PREDICT and EVAL mode)
            "classes":       tf.argmax(input=logits, axis=1),
            # Add `softmax_tensor` to the graph. It is used for PREDICT and by the
            # `logging_hook`.
            "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
        }

        if mode == tf.estimator.ModeKeys.PREDICT:
            return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

        # Calculate Loss (for both TRAIN and EVAL modes)
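        # sparse_softmax_cross_entropy expects labels of shape [batch_size] and
        # logits of shape [batch_size, n_classes]; their first dimensions must
        # match, which is exactly what the error below complains about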
        loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

        # Configure the Training Op (for TRAIN mode)
        if mode == tf.estimator.ModeKeys.TRAIN:
            optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
            train_op = optimizer.minimize(
                    loss=loss,
                    global_step=tf.train.get_global_step())
            return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

        # Add evaluation metrics (for EVAL mode)
        eval_metric_ops = {
            "accuracy": tf.metrics.accuracy(
                    labels=labels, predictions=predictions["classes"])}
        return tf.estimator.EstimatorSpec(
                mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)

I get a shape mismatch error between the labels and the logits:

InvalidArgumentError (see above for traceback): logits and labels must have the same first dimension, got logits shape [3,2] and labels shape [1]

filenames_array is an array of 16 strings

["file1.png", "file2.png", "file3.png", ...]

and labels_array is an array of 16 integers

[0,0,1,1,0,1,0,0,0,...]

The main function is:

# Create the Estimator
mnist_classifier = tf.estimator.Estimator(model_fn=cnn_model_fn, model_dir="/tmp/test_convnet_model")

# Train the model
cust_train_input_fn = lambda: train_input_fn_custom(
        filenames_array=filenames, labels_array=labels, batch_size=1)

mnist_classifier.train(
        input_fn=cust_train_input_fn,
        steps=20000,
        hooks=[logging_hook])
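train_input_fn_custom is not reproduced here (its full definition is in the gist linked in EDIT2 below). As a minimal tf.data sketch of what such an input function could look like, assuming the 40x40 PNGs are meant to be decoded as grayscale:

def train_input_fn_custom(filenames_array, labels_array, batch_size):
    # Build a dataset of (filename, label) pairs
    dataset = tf.data.Dataset.from_tensor_slices((filenames_array, labels_array))

    def _parse(filename, label):
        image = tf.read_file(filename)
        # channels=1 forces grayscale decoding, so each image is 40x40x1
        image = tf.image.decode_png(image, channels=1)
        image = tf.cast(image, tf.float32) / 255.0
        return image, label

    dataset = dataset.map(_parse).shuffle(16).repeat().batch(batch_size)
    return dataset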

I tried to reshape the logits, without success:

logits = tf.reshape(logits, [1, 2])

I need your help, thanks.


EDIT

After searching some more: in the first line of my model function,

input_layer = tf.reshape(features, [-1, 40, 40, 1])

the "-1" that signifies that the batch_size dimension will be dynamically calculated have here the value "3". The same "3" as in my error : logits and labels must have the same first dimension, got logits shape [3,2] and labels shape [1]

If I force the value to "1", I get this new error:

Input to reshape is a tensor with 4800 values, but the requested shape has 1600

Maybe there is a problem with my features?
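(A guess, since the pipeline is only in the gist: the numbers point to a channel mismatch, because 4800 = 40 x 40 x 3 while 1600 = 40 x 40 x 1. If the PNGs are decoded as RGB, the reshape to [-1, 40, 40, 1] splits each image into 3 slices along the batch dimension, which would also explain the batch of 3 above. A hypothetical fix in the input pipeline, assuming it uses tf.image.decode_png and "filename" is illustrative:)

# decode_png with channels=1 yields 40*40*1 = 1600 values per image
# instead of 40*40*3 = 4800, so the reshape no longer triples the batch
image = tf.image.decode_png(tf.read_file(filename), channels=1)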


EDIT2

The complete code is here: https://gist.github.com/geoffreyp/cc8e97aab1bff4d39e10001118c6322e


EDIT3

I updated the gist with

logits = tf.layers.dense(inputs=dropout, units=1)

https://gist.github.com/geoffreyp/cc8e97aab1bff4d39e10001118c6322e

But I don't completely understand your answer about the batch size. How can the batch size be 3 here when I chose a batch size of 1?

If I choose batch_size = 3, I get this error: logits and labels must have the same first dimension, got logits shape [9,1] and labels shape [3]

I tried to reshape the labels:

labels = tf.reshape(labels, [3, 1])

and I updated the features and labels structure:

    filenames_train = [['blackcorner-data/1.png', 'blackcorner-data/2.png',
                        'blackcorner-data/3.png', 'blackcorner-data/4.png',
                        'blackcorner-data/n1.png'],
                       ['blackcorner-data/n2.png', 'blackcorner-data/n3.png',
                        'blackcorner-data/n4.png', 'blackcorner-data/11.png',
                        'blackcorner-data/21.png'],
                       ['blackcorner-data/31.png', 'blackcorner-data/41.png',
                        'blackcorner-data/n11.png', 'blackcorner-data/n21.png',
                        'blackcorner-data/n31.png']]

labels = [[0, 0, 0, 0, 1], [1, 1, 1, 0, 0], [0, 0, 1, 1, 1]]

but without success...


Solution

  • The problem is in your target shape and is related to choosing an appropriate loss function. You have two possibilities:

    1. First possibility: if your target is 1D integer encoded, you can use sparse_categorical_crossentropy as the loss function:

    import numpy as np
    from tensorflow.keras.layers import Input, Dense
    from tensorflow.keras.models import Model

    n_class = 3
    n_features = 100
    n_sample = 1000

    X = np.random.randint(0, 10, (n_sample, n_features))
    y = np.random.randint(0, n_class, n_sample)  # shape (n_sample,), integer labels

    inp = Input((n_features,))
    x = Dense(128, activation='relu')(inp)
    out = Dense(n_class, activation='softmax')(x)

    model = Model(inp, out)
    model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    history = model.fit(X, y, epochs=3)
    

    2. Second possibility: if you have one-hot encoded your target so that it has the 2D shape (n_samples, n_class), you can use categorical_crossentropy:

    import numpy as np
    import pandas as pd
    from tensorflow.keras.layers import Input, Dense
    from tensorflow.keras.models import Model

    n_class = 3
    n_features = 100
    n_sample = 1000

    X = np.random.randint(0, 10, (n_sample, n_features))
    y = pd.get_dummies(np.random.randint(0, n_class, n_sample)).values  # shape (n_sample, n_class)

    inp = Input((n_features,))
    x = Dense(128, activation='relu')(inp)
    out = Dense(n_class, activation='softmax')(x)

    model = Model(inp, out)
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    history = model.fit(X, y, epochs=3)
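
In both cases, the rule behind the original error message is the same: the first dimension of the model output and of the target must both equal the number of samples in the batch. A quick sanity check of the two target shapes, reusing the sizes from the snippets above:

    import numpy as np
    import pandas as pd

    n_class, n_sample = 3, 1000
    y_sparse = np.random.randint(0, n_class, n_sample)  # 1D integer targets (possibility 1)
    y_onehot = pd.get_dummies(y_sparse).values          # 2D one-hot targets (possibility 2)

    print(y_sparse.shape)  # (1000,)   -> sparse_categorical_crossentropy
    print(y_onehot.shape)  # (1000, 3) -> categorical_crossentropy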