I want to augment a dataset of images stored as a NumPy array in X_train, with the corresponding labels stored in y_train. The shapes are as follows:
print(X_train.shape)
print(y_train.shape)
Output:
(1100, 22, 64, 64)
(1100,)
A single image looks like this:
plt.imshow(X_train[0][0])
How do I augment this dataset, so that I don't need to add its label every time?
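One option is to use a generator that yields each augmented sample together with its label:
import matplotlib.pyplot as plt
import numpy as np

def get_augmented_sample(X_train, y_train):
    # iterate over samples and labels together so they stay paired
    for x, y in zip(X_train, y_train):
        # data augmentation to x, e.g. adding some noise
        x_augmented = x + np.random.normal(0, 20, x.shape)
        yield x_augmented, y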
data_generator = get_augmented_sample(X_train, y_train)
# get an augmented sample
x, y = next(data_generator)
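# show the original and the augmented image side by side
fig, (ax_orig, ax_aug) = plt.subplots(1, 2)
ax_orig.imshow(X_train[0][0])  # original
ax_aug.imshow(x[0])  # augmented
plt.show()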
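If you would rather have the augmented data as arrays than pull samples one at a time, the same idea can be sketched as below. This is only a rough sketch: the function `augment_dataset`, its parameters `n_copies`, `noise_std` and `seed`, and the small stand-in arrays `X_demo` and `y_demo` are names I made up for illustration, and Gaussian noise is just a placeholder for whatever augmentation you actually need.

```python
import numpy as np

def augment_dataset(X, y, n_copies=2, noise_std=20.0, seed=0):
    """Return the original samples plus n_copies noisy versions, labels kept aligned."""
    rng = np.random.default_rng(seed)
    # one noisy copy of the whole array per requested copy
    noisy = [X + rng.normal(0.0, noise_std, X.shape) for _ in range(n_copies)]
    X_aug = np.concatenate([X] + noisy, axis=0)
    # labels repeat in the same order as the stacked copies
    y_aug = np.tile(y, n_copies + 1)
    return X_aug, y_aug

# small stand-in data with the same layout as X_train / y_train (hypothetical values)
X_demo = np.random.rand(10, 22, 8, 8) * 255
y_demo = np.arange(10) % 2

X_aug, y_aug = augment_dataset(X_demo, y_demo, n_copies=2)
print(X_aug.shape, y_aug.shape)
```

Because the labels are tiled in the same order as the copies are concatenated, sample `i` in every copy keeps label `y[i]`, so nothing has to be re-labelled by hand. Note that this holds all copies in memory at once, so for a large dataset the generator is the lighter option.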