I am trying to use a cross-validation approach for the model I use to classify images into 3 classes. I use the following code to import the images:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255)
data = train_datagen.flow_from_directory(directory=train_path,
                                         target_size=(300, 205),
                                         batch_size=8,
                                         color_mode='grayscale',
                                         class_mode='categorical')
It worked fine for training and testing the model, until I tried to use KFold from sklearn.model_selection. All the examples I find on the internet work with plain numpy arrays, whereas I have labelled image data: flow_from_directory returns a DirectoryIterator, and I could not work out how to convert it into arrays that can be passed to the kfold.split function.
I tried the following approaches; please bear in mind that I am new to classification models:
np_data = data.next()
num_folds = 5
kfold = KFold(n_splits=num_folds, shuffle=True)
for train, test in kfold.split(np_data):
Then I get: ValueError: Cannot have number of splits n_splits=5 greater than the number of samples: n_samples=2.
I believe I get this ValueError because np_data is a tuple with two nested arrays inside: the first holds the images and the second holds their classes.
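To illustrate what I think is happening (the shapes below assume my settings of batch_size=8, target_size=(300, 205), grayscale images and 3 classes):
np_data = data.next()      # one batch, returned as an (images, labels) tuple
print(len(np_data))        # 2 -> this is why kfold.split() reports n_samples=2
print(np_data[0].shape)    # (8, 300, 205, 1)  the images of the batch
print(np_data[1].shape)    # (8, 3)            their one-hot class labels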
I could shuffle and k-fold only the images, but then, without the information about which class they belong to, I cannot train my model properly. I have tried following the guide in this link, but their training and testing data seem to be imported in a different way than mine. Then I also came across this, but again it did not really help with my situation.
I have no idea what I am missing; any additional help will be much appreciated.
Lastly, I have tried doing:
x, y = data.next()
for train, test in kfold.split(x, y):
    ...
This gives me the following error when it begins the first epoch of the first fold:
ValueError: No gradients provided for any variable: ['conv2d/kernel:0', 'conv2d/bias:0', 'conv2d_1/kernel:0', 'conv2d_1/bias:0', 'conv2d_2/kernel:0', 'conv2d_2/bias:0', 'conv2d_3/kernel:0', 'conv2d_3/bias:0', 'dense/kernel:0', 'dense/bias:0', 'dense_1/kernel:0', 'dense_1/bias:0'].
The reason I got the last ValueError was that I did not include y[train] when I called model.fit(). The following worked fine for me.
After importing the images with ImageDataGenerator.flow_from_directory(...), x, y = data.next() yields the images and their labels into the x and y arrays. From there:
kfold = KFold(n_splits=num_folds, shuffle=True)
fold_no = 1
for train, test in kfold.split(x, y):
    model = keras.models.Sequential(.....)
    model.fit(x[train], y[train], epochs=epochs)
    ...
    scores = model.evaluate(x[test], y[test], verbose=0)
    ...
    fold_no = fold_no + 1
I also used this print line (inside the fold loop, after model.evaluate) to keep track of the scores:
print(f'Score for fold {fold_no}: {model.metrics_names[0]} of {scores[0]}; {model.metrics_names[1]} of {scores[1]*100}%')
Additionally, the loss and accuracy results can be stored in two separate lists (initialised before the loop as acc_per_fold = [] and loss_per_fold = []) and averaged at the end of the folds:
acc_per_fold.append(scores[1] * 100)
loss_per_fold.append(scores[0])
The above 2 lines have to be inside the for loop (for train, test in kfold.split(x, y):), and the lines below outside of it.
print("\n\n Overall accuracy: " + str(np.average(acc_per_fold)))
print("Overall loss: " + str(np.average(loss_per_fold)))