neural-network mini-batch

Mini Batching Neural Network


I'm trying to implement mini-batching correctly for my own NN.

But I can't wrap my head around what exactly is being summed. Do I sum the gradients or the delta weights (where the learning rate is already multiplied in) for the weight and bias, which in my example are:

Delta Weight: activation'(neurons) ⊗ Error × learningRate × input

Delta Bias: activation'(neurons) ⊗ Error × learningRate

Do I also divide those summed delta weights or gradients by the batch size?
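
For concreteness, here is a small numpy sketch of the two per-sample quantities in question; the single dense layer, the sigmoid activation, and the example shapes/values are just assumptions for illustration, not my actual code:

```python
import numpy as np

# Illustrative per-sample values for one dense layer (shapes/values are made up)
x = np.array([0.5, -1.0, 2.0])      # layer input
z = np.array([0.3, -0.7])           # pre-activations ("neurons" before the activation)
error = np.array([0.1, -0.2])       # error at this layer for one sample
lr = 0.01                           # learning rate

def sigmoid_prime(z):
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)

delta = sigmoid_prime(z) * error    # activation'(neurons) ⊗ Error (elementwise)

grad_W = np.outer(delta, x)         # gradient w.r.t. the weights (no learning rate)
grad_b = delta                      # gradient w.r.t. the bias

delta_W = lr * grad_W               # "delta weight" as written above (learning rate folded in)
delta_b = lr * grad_b               # "delta bias"
```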

EDIT:

So, to sum up all the questions:

1) Is the quantity that gets summed called the gradient or the delta weight?

2) Is the learning rate already multiplied into what gets summed?

3) Do I divide the sum by the batch size and apply the learning rate only once the batch is finished?


Solution

  • After researching for a whole night and looking at lots of blogs and articles, I came to these answers (which work for me!):

    1) Never mind; people call both the "gradient".

    2) Sum them without the learning rate.

    3) Yes; when finishing the batch you divide by the batch size and multiply by the learning rate (... and do momentum optimization if implemented).
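
    Putting those three points together, here is a minimal numpy sketch of one mini-batch update for a single dense layer; the sigmoid activation, the error convention (target minus output), and names such as train_mini_batch are assumptions for illustration, not the actual implementation from the question:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def train_mini_batch(W, b, batch_inputs, batch_errors, batch_pre_activations, lr):
    """One mini-batch update: sum raw gradients (no learning rate),
    then average over the batch and apply the learning rate once at the end."""
    grad_W = np.zeros_like(W)
    grad_b = np.zeros_like(b)

    for x, error, z in zip(batch_inputs, batch_errors, batch_pre_activations):
        delta = sigmoid_prime(z) * error      # activation'(neurons) ⊗ Error, per sample
        grad_W += np.outer(delta, x)          # accumulate the weight gradient
        grad_b += delta                       # accumulate the bias gradient

    batch_size = len(batch_inputs)
    # divide by the batch size and only now multiply by the learning rate
    W += lr * grad_W / batch_size             # '+' assumes Error = target - output (delta rule)
    b += lr * grad_b / batch_size
    return W, b

# Usage example with made-up shapes: 3 inputs, 2 output neurons, batch of 4 samples
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 3))
b = np.zeros(2)
inputs = [rng.normal(size=3) for _ in range(4)]
pre_acts = [W @ x + b for x in inputs]
errors = [np.array([1.0, 0.0]) - sigmoid(z) for z in pre_acts]   # target - output
W, b = train_mini_batch(W, b, inputs, errors, pre_acts, lr=0.1)
```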