tensorflow, image-processing, import, deep-learning, imagedata

How to import an image dataset from a folder with tensorflowio


Hi, I have an image dataset in a shared folder. The dataset path is /media/sharing_folder/data, and the data folder has two subfolders, "masked" and "unmasked". I try to import the data like this:

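import os
import cv2

DATADIR = "/media/sharing_folder/data"
CATEGORIES = ["masked", "unmasked"]
IMG_SIZE = 100  # assumed target size in pixels

data = []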

def create_data():
    for category in CATEGORIES:
        path =  os.path.join(DATADIR, category) #path to masked or unmasked dir
        class_num = CATEGORIES.index(category)
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path,img), cv2.IMREAD_GRAYSCALE)
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
                data.append([new_array, class_num])
            except Exception as e:
                pass
            
create_data()

However, this import runs very slowly. I want to import the data with tensorflowio. How can I do that?


Solution

  • To show your code properly, start a new line and press the backtick key (upper left corner of the keyboard, just to the left of the 1 key) four times. Then start another new line and enter your code. When you have finished, add a new line and press the same key four times again to close the code area. Here is your code:

    data = []
    
    def create_data(): 
        for category in CATEGORIES: 
            path = os.path.join(DATADIR, category) #path to masked or unmasked dir 
            class_num = CATEGORIES.index(category) 
            for img in os.listdir(path): 
                try:
                    img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
                    new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE)) 
                    data.append([new_array, class_num]) 
                except Exception as e: 
                    pass
    
    create_data()
    

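    Now, to answer your question: I recommend the Keras ImageDataGenerator, which reads the images from the folder in batches instead of loading them all up front. See its documentation for the full list of options.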

    import tensorflow as tf

    split = 0.2      # fraction of images to use for validation
    height = 254     # desired image height
    width = 254      # desired image width
    batch_size = 32  # desired batch size
    seed = 123       # arbitrary value
    data_dir = '/media/sharing_folder/data'  # contains the "masked" and "unmasked" subfolders
    img_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1/255, validation_split=split)
    train_gen = img_gen.flow_from_directory(directory=data_dir, target_size=(height, width),
                  color_mode='grayscale', class_mode='categorical', batch_size=batch_size,
                  seed=seed, shuffle=True, subset='training')
    valid_gen = img_gen.flow_from_directory(directory=data_dir, target_size=(height, width),
                  color_mode='grayscale', class_mode='categorical', batch_size=batch_size,
                  seed=seed, shuffle=False, subset='validation')
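
As far as I know, flow_from_directory takes the class labels from the subfolder names and numbers them in alphanumeric order, so here "masked" should map to 0 and "unmasked" to 1. The sketch below mimics that rule with the standard library only. infer_class_indices is a hypothetical helper, not part of Keras, and the temporary folder only stands in for the real data directory.

```python
import os
import tempfile


def infer_class_indices(data_dir):
    """Map each immediate subfolder name to an index, in sorted order."""
    classes = sorted(
        entry.name for entry in os.scandir(data_dir) if entry.is_dir()
    )
    return {name: index for index, name in enumerate(classes)}


# Demo on a throwaway directory that mimics the layout from the question.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("unmasked", "masked"):
        os.makedirs(os.path.join(demo_dir, name))
    class_indices = infer_class_indices(demo_dir)
    print(class_indices)  # {'masked': 0, 'unmasked': 1}
```

On the real generator, train_gen.class_indices shows the mapping Keras actually used, which is worth checking before interpreting predictions.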
    

    Then compile your model with categorical_crossentropy as the loss, and train it with model.fit using the generators defined above. For validation, it is best to set shuffle=False in valid_gen so that the validation images are provided in the same order in every epoch.
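
A minimal sketch of the compile step, assuming a small CNN. build_model and its layers are illustrative assumptions, not something the answer prescribes. The input has a single channel because the generators use color_mode='grayscale', and the output layer has two units for the two classes.

```python
import tensorflow as tf


def build_model(height=254, width=254, num_classes=2):
    """Build and compile a small illustrative CNN (architecture is an assumption)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(height, width, 1)),  # grayscale images: 1 channel
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        # One softmax unit per class, matching class_mode='categorical'.
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_model()
model.summary()
```

With train_gen and valid_gen defined as above, training would then look like model.fit(train_gen, validation_data=valid_gen, epochs=10), where the epoch count is an arbitrary choice.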