The following is my CSV file:
file,pt1,pt2,pt3,pt4,pt5,pt6
object/obj0.png,66.0335639098,39.0022736842,30.2270075188,36.4216781955,59.582075188,39.6474225564
object/obj0.png,66.0335639098,39.0022736842,30.2270075188,36.4216781955,59.582075188,39.6474225564
object/obj0.png,66.0335639098,39.0022736842,30.2270075188,36.4216781955,59.582075188,39.6474225564
How do I load these images and their annotations to train my simple CNN?
I tried using ImageDataGenerator as follows, but it didn't help. Is there any alternative?
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
The ImageDataGenerator object can yield data either from numpy arrays or directly from directories. In the latter case, the labels are inferred automatically from the folder structure of your data: each class of images should live in a separate folder. When the label structure is more complex, as in your case, you can write your own custom generator. If you do so, make use of Keras' Sequence object, which allows for safe multiprocessing. The Keras website contains a boilerplate example. In your case, your code would look something like this:
from keras.utils import Sequence
from keras.preprocessing.image import load_img, img_to_array
import math
import random
import numpy as np
import pandas as pd

class DataSequence(Sequence):

    def __init__(self, csv_path, batch_size, mode='train'):
        self.df = pd.read_csv(csv_path)  # read your csv file with pandas
        self.bsz = batch_size  # batch size
        self.mode = mode  # shuffle when in train mode

        # Keep the labels and the list of image locations in memory
        self.labels = self.df[['pt1', 'pt2', 'pt3', 'pt4', 'pt5', 'pt6']].values
        self.im_list = self.df['file'].tolist()
        self.on_epoch_end()  # build the initial index order

    def __len__(self):
        # Number of batches per epoch
        return int(math.ceil(len(self.df) / float(self.bsz)))

    def on_epoch_end(self):
        # Reshuffle the indexes after each epoch if in training mode
        self.indexes = list(range(len(self.im_list)))
        if self.mode == 'train':
            random.shuffle(self.indexes)

    def get_batch_labels(self, batch_idx):
        # Fetch the labels for a list of sample indexes
        return self.labels[batch_idx]

    def get_batch_features(self, batch_idx):
        # Load the images for a list of sample indexes;
        # this assumes all images have the same size
        return np.array([img_to_array(load_img(self.im_list[i])) for i in batch_idx])

    def __getitem__(self, idx):
        # Sample indexes belonging to batch number idx
        batch_idx = self.indexes[idx * self.bsz:(idx + 1) * self.bsz]
        batch_x = self.get_batch_features(batch_idx)
        batch_y = self.get_batch_labels(batch_idx)
        return batch_x, batch_y
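To make the batch arithmetic behind __len__ and __getitem__ easier to follow, here is a minimal sketch in plain Python, independent of Keras. The function name epoch_batches and its seed parameter are made up for illustration and are not part of the Keras API. It shows that the number of batches is the sample count divided by the batch size, rounded up, and that the last batch may be smaller than the rest.

```python
import math
import random


def epoch_batches(n_samples, batch_size, shuffle=True, seed=None):
    """Return one epoch of index batches (hypothetical helper, not Keras API)."""
    indexes = list(range(n_samples))
    if shuffle:
        # Shuffle a private copy of the indexes, as on_epoch_end does in train mode
        random.Random(seed).shuffle(indexes)
    # Same rounding-up rule as __len__
    n_batches = math.ceil(n_samples / batch_size)
    # Same slicing rule as __getitem__
    return [indexes[b * batch_size:(b + 1) * batch_size] for b in range(n_batches)]


batches = epoch_batches(n_samples=10, batch_size=4, shuffle=False)
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

With shuffling enabled, every sample still appears exactly once per epoch; only the order changes.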
You can use this Sequence object to train your model with model.fit_generator() (in recent Keras versions, fit_generator is deprecated and model.fit() accepts a Sequence directly):
sequence = DataSequence('./path_to/csv_file.csv', batch_size)
model.fit_generator(sequence, epochs=1, use_multiprocessing=True)
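Because the Sequence looks up the labels by column name, a header that does not match pt1 to pt6 exactly, or a row with a missing value, can lead to errors or missing labels. Before training, it may be worth checking the file first. The following is only a sketch using the standard library: the helper check_csv and the constant EXPECTED_COLUMNS are hypothetical names, and the sample rows are shortened, made-up values.

```python
import csv
import io

# Column layout assumed from the question's CSV
EXPECTED_COLUMNS = ('file', 'pt1', 'pt2', 'pt3', 'pt4', 'pt5', 'pt6')


def check_csv(lines, expected_columns=EXPECTED_COLUMNS):
    """Return a list of problems found in the header or rows (empty if none)."""
    reader = csv.reader(lines)
    header = next(reader)
    problems = []
    if tuple(header) != tuple(expected_columns):
        problems.append('unexpected header: %r' % header)
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(expected_columns):
            problems.append('line %d has %d fields' % (line_no, len(row)))
            continue
        try:
            # Every column after the file name should parse as a number
            [float(value) for value in row[1:]]
        except ValueError:
            problems.append('line %d has a non-numeric coordinate' % line_no)
    return problems


# Inline sample data for illustration; with a real file, pass an open handle:
#     with open('./path_to/csv_file.csv', newline='') as f:
#         problems = check_csv(f)
sample = io.StringIO(
    'file,pt1,pt2,pt3,pt4,pt5,pt6\n'
    'object/obj0.png,66.03,39.00,30.23,36.42,59.58,39.65\n'
)
print(check_csv(sample))  # []
```

An empty list means the header and every row have the expected shape.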
See also this related question.