
Keras - Is there a way to manage the filenames generated by the flow_from_directory function of ImageDataGenerator?


I need to keep the original filenames of my images after data augmentation, which I perform with the flow_from_directory method of Keras's ImageDataGenerator class. The filenames encode the labels, and I plan to move the augmented images into their respective folders based on those names. Feel free to ask for any further information.

Here is my ImageDataGenerator and how I currently run the augmentation:

aug = ImageDataGenerator(
    rotation_range=20,
    zoom_range=0.15,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.15,
    horizontal_flip=True,
    fill_mode="nearest")

i = 0
for batch in aug.flow_from_directory(extract_dir, batch_size=1, color_mode='grayscale', target_size=(28, 28),
                                     save_to_dir=extract_dir + '/augmented', save_prefix='aug'):
    i += 1
    if i == 100:
        break

Solution

  • Unfortunately, there is no easy way to access the filenames from the ImageDataGenerator.flow_from_directory iterator. Instead, you can use ImageDataGenerator.flow to apply your augmentations to one image at a time, then save each augmented image yourself under its original name using another image processing library, e.g. cv2.

    import os
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' # suppress TensorFlow log messages
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    import cv2
    import numpy as np
    
    
    image_folder = 'img_folder' # your image folder
    print(f'my images: {os.listdir(image_folder)}')
    
    
    aug_image_folder = 'aug_img_folder' # aug images will be saved here
    os.makedirs(aug_image_folder,exist_ok=True)
    
    
    # define your augmentations
    aug = ImageDataGenerator(
        rotation_range=20,
        zoom_range=0.15,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.15,
        horizontal_flip=True,
        fill_mode="nearest")
    
    
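    # iterate over the image folder
    image_names = os.listdir(image_folder)
    for image_name in image_names:
        # read as a 3-channel BGR image; imread returns None for unreadable files
        img = cv2.imread(os.path.join(image_folder, image_name), cv2.IMREAD_COLOR)
        if img is None:
            continue
    
        # get one augmented image; flow expects a batch of shape (1, H, W, C)
        aug_img_iterator = aug.flow(x=np.expand_dims(img, 0), batch_size=1)
        aug_img = next(aug_img_iterator)
    
        # flow returns floats, so cast back to uint8 and save under the original name
        cv2.imwrite(os.path.join(aug_image_folder, image_name), aug_img[0].astype(np.uint8))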
    
    print(f'aug images: {os.listdir(aug_image_folder)}')
    
    Output:
    
    my images: ['img_0.png', 'img_2.png', 'img_1.png']
    aug images: ['img_0.png', 'img_2.png', 'img_1.png']
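
The loop above writes exactly one augmented image per original, under the same name. If several augmented copies per image are needed (the question stops after 100 batches), reusing the same name would overwrite earlier copies. The following is only a sketch of one possible naming scheme that keeps the original stem, and therefore the label it encodes, while adding a counter. The helper names `augmented_name` and `plan_output_names` and the `_aug{k}` suffix are assumptions made for illustration; they are not part of Keras or of the answer above.

```python
from pathlib import Path


def augmented_name(original_name: str, copy_index: int) -> str:
    """Build an output filename that keeps the original stem and extension.

    Hypothetical helper: the '_aug{k}' suffix is an arbitrary choice.
    """
    path = Path(original_name)
    # The stem carries the label information, so it is kept unchanged;
    # the counter only prevents copies from overwriting each other.
    return f"{path.stem}_aug{copy_index}{path.suffix}"


def plan_output_names(image_names, copies_per_image):
    """Map each original filename to the names of its augmented copies."""
    return {
        name: [augmented_name(name, k) for k in range(copies_per_image)]
        for name in image_names
    }


if __name__ == "__main__":
    plan = plan_output_names(["img_0.png", "img_1.png"], 2)
    for original, outputs in plan.items():
        print(original, "->", outputs)
```

In the loop above, this would amount to calling next(aug_img_iterator) once per copy and passing augmented_name(image_name, k) instead of image_name when building the path for cv2.imwrite.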