I want to create mini-batches for my deep learning neural network program from a training set of m examples. I have tried:
# First Shuffle (X, Y)
permutation = list(np.random.permutation(m))
shuffled_X = X[:, permutation]
shuffled_Y = Y[:, permutation].reshape((1,m))
# Partition (shuffled_X, shuffled_Y), excluding the end case where the last mini-batch contains fewer training examples.
num_complete_minibatches = math.floor(m/mini_batch_size) # number of mini-batches of size mini_batch_size in the partitioning
for k in range(0, num_complete_minibatches):
    ### START CODE HERE ### (approx. 2 lines)
    mini_batch_X = shuffled_X[mini_batch_size*k:mini_batch_size*(k+2)]
    mini_batch_Y = shuffled_Y[mini_batch_size*k:mini_batch_size*(k+2)]
But this gives me the following results:
shape of the 1st mini_batch_X: (128, 148)
shape of the 2nd mini_batch_X: (128, 148)
shape of the 3rd mini_batch_X: (12288, 148)
shape of the 1st mini_batch_Y: (1, 148)
shape of the 2nd mini_batch_Y: (0, 148)
shape of the 3rd mini_batch_Y: (1, 148)
mini batch sanity check: [ 0.90085595 -0.7612069 0.2344157 ]
The expected output is:
shape of the 1st mini_batch_X (12288, 64)
shape of the 2nd mini_batch_X (12288, 64)
shape of the 3rd mini_batch_X (12288, 20)
shape of the 1st mini_batch_Y (1, 64)
shape of the 2nd mini_batch_Y (1, 64)
shape of the 3rd mini_batch_Y (1, 20)
mini batch sanity check [ 0.90085595 -0.7612069 0.2344157 ]
I'm sure there is something wrong with the slicing I have implemented, but I can't figure it out. Any help is much appreciated. Thanks!
I think you are not slicing the NumPy arrays properly. The way you shuffled the arrays was correct, because there you indexed the second dimension. When you build the mini-batches, you don't want to slice the first dimension, so keep it as it is using `:` and slice the second dimension using `<Start Index>:<End Index>`. This is what I'm doing in the code below.
for k in range(num_complete_minibatches+1):
    ### START CODE HERE ### (approx. 2 lines)
    # Keep every row (feature) and take one block of columns (examples);
    # the last iteration picks up the remaining, smaller mini-batch.
    mini_batch_X = shuffled_X[:, mini_batch_size*k:mini_batch_size*(k+1)]
    mini_batch_Y = shuffled_Y[:, mini_batch_size*k:mini_batch_size*(k+1)]
    print(mini_batch_X.shape, mini_batch_Y.shape)
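For reference, here is a minimal self-contained sketch of the same idea that you can run on its own. It is not the assignment's reference solution: the function name `make_mini_batches`, the `seed` parameter, and the zero-filled toy arrays are assumptions made for illustration. It also uses `math.ceil` instead of `num_complete_minibatches+1`, which should avoid an empty final mini-batch when m happens to be an exact multiple of `mini_batch_size`.

```python
import math

import numpy as np


def make_mini_batches(X, Y, mini_batch_size=64, seed=0):
    """Split (X, Y) into shuffled mini-batches along the example (column) axis.

    Hypothetical helper: assumes X has shape (n_features, m) and Y has shape (1, m).
    """
    m = X.shape[1]
    rng = np.random.default_rng(seed)
    permutation = rng.permutation(m)

    # Shuffle columns only, so each example stays paired with its label.
    shuffled_X = X[:, permutation]
    shuffled_Y = Y[:, permutation]

    mini_batches = []
    # ceil counts the final, smaller batch without producing an empty one.
    num_batches = math.ceil(m / mini_batch_size)
    for k in range(num_batches):
        start = k * mini_batch_size
        end = start + mini_batch_size  # NumPy clips a slice that runs past the end
        mini_batches.append((shuffled_X[:, start:end], shuffled_Y[:, start:end]))
    return mini_batches


# Toy data with the shapes from the question: 12288 features, 148 examples.
X = np.zeros((12288, 148))
Y = np.zeros((1, 148))

# Slicing without the leading ":" cuts rows (features), not examples,
# which reproduces the unexpected shapes from the question.
print(X[0:128].shape, Y[64:192].shape)  # (128, 148) (0, 148)

batches = make_mini_batches(X, Y, mini_batch_size=64)
print([bx.shape for bx, _ in batches])  # [(12288, 64), (12288, 64), (12288, 20)]
print([by.shape for _, by in batches])  # [(1, 64), (1, 64), (1, 20)]
```

The second `print` matches the expected shapes from the question: two full mini-batches of 64 examples and a final one holding the remaining 20.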