Hi, I have an image dataset in a shared folder. The dataset path is /media/sharing_folder/data, and the data folder has two subfolders, "masked" and "unmasked". I try to import the data like this:
data = []
def create_data():
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category)  # path to masked or unmasked dir
        class_num = CATEGORIES.index(category)
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
                data.append([new_array, class_num])
            except Exception as e:
                pass
create_data()
However, this import runs very slowly. I want to import the data with tensorflow-io instead. How can I do that?
To show your code properly, start a new line containing three backticks (the ` key, just to the right of the 1 key in the upper left corner of your keyboard), put your code on the following lines, and then close the code block with another line of three backticks. Here is your code, formatted that way:
data = []
def create_data():
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category)  # path to masked or unmasked dir
        class_num = CATEGORIES.index(category)
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
                data.append([new_array, class_num])
            except Exception as e:
                pass
create_data()
Now, to answer your question, I recommend you use the Keras ImageDataGenerator (tf.keras.preprocessing.image.ImageDataGenerator); see its documentation in the TensorFlow API docs.
split = .2       # percentage of images to use for validation
height = 254     # desired image height
width = 254      # desired image width
batch_size = 32  # desired batch size
seed = 123       # arbitrary seed value
data_dir = r'/media/sharing_folder/data'
img_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1/255, validation_split=split)
train_gen = img_gen.flow_from_directory(directory=data_dir, target_size=(height, width),
                                        color_mode='grayscale', class_mode='categorical',
                                        batch_size=batch_size, seed=seed, shuffle=True,
                                        subset='training')
valid_gen = img_gen.flow_from_directory(directory=data_dir, target_size=(height, width),
                                        color_mode='grayscale', class_mode='categorical',
                                        batch_size=batch_size, seed=seed, shuffle=False,
                                        subset='validation')
Then compile your model with loss='categorical_crossentropy' and call model.fit to train it using the generators defined above. For validation it is best to set shuffle=False in valid_gen so the validation images are provided in the same order every epoch.
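As a minimal sketch of that last step (the layer stack below is just a placeholder I made up, and epochs=10 is arbitrary; substitute your own architecture and settings), compiling and fitting with the two generators could look like this:

import tensorflow as tf

# Hypothetical placeholder model; replace with your own architecture.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(height, width, 1)),  # 1 channel because color_mode='grayscale'
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(2, activation='softmax')    # two classes: masked / unmasked
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',        # matches class_mode='categorical'
              metrics=['accuracy'])

# Train on the training subset and validate on the validation subset.
history = model.fit(train_gen,
                    epochs=10,
                    validation_data=valid_gen)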