Tags: python, tensorflow, keras, generator, tensorflow-datasets

Tensorflow variable image input size (autoencoder, upscaling ...)


Edit: WARNING: Using images of different sizes is not recommended, as all tensors in a batch need to have the same size to allow for parallelization.

I've been looking all over for a solution on how to use images of different sizes as the input to a neural network.

Numpy

My first idea was to use NumPy. However, because the images have different sizes, I wasn't able to stack them into a single array, and TensorFlow wouldn't accept the ragged numpy.ndarray.

Trying a simple Python list didn't work either, as it isn't supported.
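
A minimal sketch of the failure mode (assuming only NumPy and TensorFlow; the sizes match the folders described below):

import numpy as np
import tensorflow as tf

# Different-sized images can't form a dense array; NumPy falls back to object dtype.
imgs = [np.zeros((50, 50, 3)), np.zeros((100, 100, 3))]
ragged = np.array(imgs, dtype=object)

# tf.convert_to_tensor(ragged) raises ValueError: object arrays are not supported.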

Dataset generator

Tried implementing a custom generator with yield, but ran into loads of errors:

  • Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled
  • `y` argument is not supported when using dataset as input
  • No gradients provided for any variable
  • generator yielded an element of shape (50, 50, 3) where an element of shape (None, None, None, 3) was expected
  • Cannot convert value to a TensorFlow DType
  • tf.placeholder is not compatible with eager execution

These and other errors occurred while trying different solutions for implementing the generator (from SO and other sites).

File structure

/1
  -0.png
  -1.png
/2
  -0.png
  -1.png
/3
  -0.png
  -1.png

Images inside folder 1 are 50x50 px, those in folder 2 are 100x100 px, and those in folder 3 are 200x200 px.
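
If you want to reproduce this layout, here is a minimal sketch (the folder names and the 'path_to_parent_dir' placeholder come from this post; the random noise is just stand-in pixel data):

import os
import cv2
import numpy as np

path = 'path_to_parent_dir'
for folder, size in [('1', 50), ('2', 100), ('3', 200)]:
    os.makedirs(os.path.join(path, folder), exist_ok=True)
    for name in ['0.png', '1.png']:
        # Write random noise of the right size, just so the pipeline has data to read.
        img = np.random.randint(0, 256, (size, size, 3), dtype=np.uint8)
        cv2.imwrite(os.path.join(path, folder, name), img)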

Upscaling model

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# None for height and width lets the model accept variable-sized images.
input_img = keras.Input(shape=(None, None, 3))

upscaled = layers.UpSampling2D((2, 2), interpolation='bilinear')(input_img)
out = layers.Conv2D(3, (3, 3), activation='sigmoid', padding='same')(upscaled)

conv_model = keras.Model(input_img, out)
conv_model.compile(optimizer='adam', loss=tf.keras.losses.MeanSquaredError())
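
Because both spatial dimensions are None, the same model accepts any input size. A quick sanity check (the output shapes follow from the 2x upsampling above):

print(conv_model(tf.zeros((1, 50, 50, 3))).shape)    # (1, 100, 100, 3)
print(conv_model(tf.zeros((1, 100, 100, 3))).shape)  # (1, 200, 200, 3)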

Solution

    After hours of working on this, I've come to a solution. In my specific case, the model takes an input image and the target is the same image upscaled 2x.

    Loading the paths to all input and output (target) data:

    path = 'path_to_parent_dir'
    in_paths = [path + '/1/' + f for f in ['0.png', '1.png']] + [path + '/2/' + f for f in ['0.png', '1.png']]
    out_paths = [path + '/2/' + f for f in ['0.png', '1.png']] + [path + '/3/' + f for f in ['0.png', '1.png']]
    

    Generator:

    import cv2

    def data_generator(in_paths, out_paths):
        # Yield one (input, target) pair at a time.
        # cv2.imread returns a uint8 BGR array; dividing by 255 scales it to [0, 1].
        for in_path, out_path in zip(in_paths, out_paths):
            yield cv2.imread(in_path) / 255, cv2.imread(out_path) / 255
    

    Converting to dataset

    train_dataset = tf.data.Dataset.from_generator(
        lambda: data_generator(in_paths, out_paths), 
        output_types=(tf.float32, tf.float32), 
        output_shapes=((None, None, 3), (None, None, 3))
    ).batch(1)
    
    validate_dataset = tf.data.Dataset.from_generator(
        lambda: data_generator(in_paths, out_paths), 
        output_types=(tf.float32, tf.float32), 
        output_shapes=((None, None, 3), (None, None, 3))
    ).batch(1)
    

    The lambda function is necessary because from_generator doesn't accept a generator object, but a reference to a callable that returns one (so you can't pass parameters directly). It's possible to use args=() inside from_generator, but in my case the data (paths) got converted to bytes-like objects, so it didn't work for me.
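
    As a side note, newer TensorFlow versions (roughly 2.4+) deprecate output_types/output_shapes in favor of output_signature; an equivalent sketch:

    train_dataset = tf.data.Dataset.from_generator(
        lambda: data_generator(in_paths, out_paths),
        output_signature=(
            tf.TensorSpec(shape=(None, None, 3), dtype=tf.float32),
            tf.TensorSpec(shape=(None, None, 3), dtype=tf.float32),
        )
    ).batch(1)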

    Warning

    This is only an example, and it uses the same data for both training and validation (which is stupid). Please use different data for each when adapting this solution.

    Training

    conv_model.fit(
        train_dataset,
        epochs=1,
        validation_data=validate_dataset
    )
    

    Auto shard policy

    This workflow produces a really long warning message after each epoch (or during one, or at random times, really), suggesting to either turn off auto-sharding or switch the auto_shard_policy to DATA to shard the dataset.

    But it's only a warning, so everything works even with it. There is a solution for disabling it (so link); a sketch of the usual fix follows.
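
    A sketch using tf.data.Options (set AutoShardPolicy.OFF instead of DATA to disable sharding entirely):

    options = tf.data.Options()
    # DATA shards by elements instead of by files, which silences the warning.
    options.experimental_distribute.auto_shard_policy = tf.data.experimental.AutoShardPolicy.DATA
    train_dataset = train_dataset.with_options(options)
    validate_dataset = validate_dataset.with_options(options)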

    Alternative

    I've also found an alternative way to make this work. The difference is that the generator has to yield a different kind of output (a tuple of tuples). Wrapping each image in a 1-tuple gives it a leading batch axis of size 1, which is why the shapes below are rank 4 and no .batch(1) call is needed. I'm not sure which way is the correct one, or whether they are equivalent.

    def data_generator_2(in_paths, out_paths):
        # Each yielded image is wrapped in a 1-tuple, adding a batch axis of size 1.
        for in_path, out_path in zip(in_paths, out_paths):
            yield (cv2.imread(in_path) / 255, ), (cv2.imread(out_path) / 255, )
    
    train_dataset = tf.data.Dataset.from_generator(
        lambda: data_generator_2(in_paths, out_paths), 
        output_types=(tf.float32, tf.float32), 
        output_shapes=((None, None, None, 3), (None, None, None, 3))
    )
    
    validate_dataset = tf.data.Dataset.from_generator(
        lambda: data_generator_2(in_paths, out_paths), 
        output_types=(tf.float32, tf.float32), 
        output_shapes=((None, None, None, 3), (None, None, None, 3))
    )
    
    # Note: batch_size must not be passed when fitting on a tf.data.Dataset;
    # here each generator element is already a batch of 1.
    conv_model.fit(
        train_dataset,
        epochs=1,
        validation_data=validate_dataset
    )
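
    To check that both pipelines feed the model the same rank-4 tensors, you can peek at one element (the shapes assume the first pair is a 50x50 input with a 100x100 target, as in the paths above):

    x, y = next(iter(train_dataset))
    print(x.shape, y.shape)  # (1, 50, 50, 3) (1, 100, 100, 3)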
    

    This was a real pain to figure out, hope it helps someone.