I am new to the imblearn library. I have an image dataset with 5 categories, and it is highly imbalanced. I load the images with the `flow_from_directory` method of TensorFlow's `ImageDataGenerator` and then use SMOTE for resampling.
img_height, img_width = 224,224
# number of images to load at each iteration
batch_size = 32
# rescaling plus random zoom and flips
train_datagen = ImageDataGenerator(
rescale=1./255,
zoom_range=0.2,
horizontal_flip=True,
vertical_flip=True
)
test_datagen = ImageDataGenerator(
rescale=1./255,
vertical_flip=True,
zoom_range=0.2,
horizontal_flip=True
)
# these are generators for train/test data that will read pictures found in the defined subfolders of 'data/'
print('Total number of images for "training":')
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size = (img_height, img_width),
batch_size = batch_size,
class_mode = "categorical",shuffle = True
#,color_mode='grayscale'
)
smote = SMOTE()
X_sm, y_sm = smote.fit_resample(train_generator, category_names)
The cell starts running, and after 30 to 40 minutes the Jupyter kernel dies without producing any results. Please help me solve this. I have a 16 GB GPU, but SMOTE will not run on the image dataset.
You can perform data augmentation on the under-represented categories.
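As a rough illustration of augmenting only the smaller classes, here is a minimal NumPy sketch. It assumes the images are already loaded into an array of shape (n_samples, height, width, channels) with integer labels; the function name `oversample_with_flips` is hypothetical, and random left-right flips stand in for whatever augmentation suits your data.

```python
import numpy as np

def oversample_with_flips(images, labels, seed=0):
    """Balance classes by adding randomly flipped copies of minority-class images.

    images: array of shape (n_samples, height, width, channels)
    labels: integer class labels of shape (n_samples,)
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()  # bring every class up to the largest class size

    extra_images, extra_labels = [], []
    for cls, count in zip(classes, counts):
        n_missing = target - count
        if n_missing == 0:
            continue
        # Pick images of this class at random (with replacement).
        picks = rng.choice(np.flatnonzero(labels == cls), size=n_missing)
        copies = images[picks].copy()
        # Flip roughly half of the copies left-right (axis 2 is the width).
        flip = rng.random(n_missing) < 0.5
        copies[flip] = copies[flip, :, ::-1]
        extra_images.append(copies)
        extra_labels.append(np.full(n_missing, cls, dtype=labels.dtype))

    if not extra_images:
        return images, labels
    return (np.concatenate([images, *extra_images]),
            np.concatenate([labels, *extra_labels]))
```

The returned arrays contain the original samples followed by the added copies, so shuffle them before training.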
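Alternatively, note that SMOTE expects a 2-D feature array rather than a generator. Resize the images to (28,28) or (32,32) and flatten each one into 784 or 1024 features (for single-channel images); then you can use SMOTE.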
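The flatten-then-SMOTE idea can be sketched as follows. This is only a sketch under assumptions: the images are already loaded into a NumPy array at the small size (for example by setting `target_size=(32, 32)` and `color_mode='grayscale'` in `flow_from_directory` and collecting the batches), and the labels are integer class indices (take `argmax` over the one-hot labels that `class_mode="categorical"` yields). The helper names are hypothetical.

```python
import numpy as np

def flatten_images(images):
    """Turn (n_samples, height, width, channels) into (n_samples, n_features)."""
    return images.reshape(len(images), -1)

def unflatten_images(flat, image_shape):
    """Restore flattened rows to (n_samples, *image_shape)."""
    return flat.reshape((-1, *image_shape))

def smote_images(images, labels, random_state=0):
    """Oversample minority classes with SMOTE on flattened pixel vectors."""
    # Imported here so the helpers above work without imbalanced-learn installed.
    from imblearn.over_sampling import SMOTE

    image_shape = images.shape[1:]
    smote = SMOTE(random_state=random_state)
    flat_resampled, labels_resampled = smote.fit_resample(
        flatten_images(images), labels
    )
    # Reshape the synthetic rows back into images for the network.
    return unflatten_images(flat_resampled, image_shape), labels_resampled
```

With the default `k_neighbors=5`, SMOTE needs at least 6 samples in every class. It also runs on the CPU and holds the whole array in memory, which is why keeping the images small matters here.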
Hope it works.