optimization neural-network deep-learning gradient-descent cost-based-optimizer

After calculating the gradients of my parameters w and u, what is the next step to optimize them in an SGD way?


What I'm coding: I'm building a simple neural network with a weight matrix w and a second parameter u for the score. After multiplying my input vector by w, the result is multiplied by the vector u to produce a single number, and that is my score.
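To make this concrete, here is a minimal sketch of that forward pass (the class and method names are just illustrative, not my actual code):

```java
public class SimpleScorer {

    // Computes h = W x, a matrix-vector product.
    static double[] matVec(double[][] w, double[] x) {
        double[] h = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            for (int j = 0; j < x.length; j++) {
                h[i] += w[i][j] * x[j];
            }
        }
        return h;
    }

    // Computes the dot product u . h, which is the scalar score.
    static double dot(double[] u, double[] h) {
        double s = 0.0;
        for (int i = 0; i < u.length; i++) {
            s += u[i] * h[i];
        }
        return s;
    }

    // score = u . (W x)
    static double score(double[][] w, double[] u, double[] x) {
        return dot(u, matVec(w, x));
    }
}
```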

Where I'm right now: I have calculated the gradients of both parameters with respect to my loss function.

My problem: Now I'm stuck on what to do next.

My solution proposal: Can I update the parameters with w = w + learningrate * w_grad (and likewise u = u + learningrate * u_grad) and repeat this procedure until my cost/loss value decreases? Does this work? Is this correct? Is this a simple implementation of Stochastic Gradient Descent?

I'm coding in Java, so if you have a simple, well-documented example of how to optimize a neural net, please share it with me.

Thanks in advance!


Solution

  • I assume that w_grad contains the partial derivatives of the loss with respect to w. What you propose is indeed an iterative optimization scheme, with one correction: instead of w = w + learningrate * w_grad you should use w = w - learningrate * w_grad (and likewise u = u - learningrate * u_grad), i.e. step against the gradient so the loss decreases. This works fine, but on a multicore machine it will only use one core. If you need a performance boost, you can try a batch algorithm: w = w - learningrate * sum(w_grad), where the sum runs over the examples in the batch. The performance boost comes from computing the per-example w_grad values in parallel.
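A minimal Java sketch of the corrected update rule, assuming the gradients are stored in plain double arrays (class and method names like SgdUpdater and sgdStep are just illustrative):

```java
public class SgdUpdater {

    // One SGD step for the vector parameter u: u = u - learningRate * uGrad.
    static void sgdStep(double[] u, double[] uGrad, double learningRate) {
        for (int i = 0; i < u.length; i++) {
            u[i] -= learningRate * uGrad[i];
        }
    }

    // One SGD step for the weight matrix w: w = w - learningRate * wGrad.
    static void sgdStep(double[][] w, double[][] wGrad, double learningRate) {
        for (int i = 0; i < w.length; i++) {
            for (int j = 0; j < w[i].length; j++) {
                w[i][j] -= learningRate * wGrad[i][j];
            }
        }
    }

    // Batch variant: subtract learningRate times the sum of the per-example
    // gradients. The per-example gradients themselves can be computed in
    // parallel before this update is applied.
    static void batchStep(double[][] w, double[][][] wGrads, double learningRate) {
        for (double[][] g : wGrads) {
            sgdStep(w, g, learningRate);
        }
    }
}
```

You would call these update steps repeatedly, once per training example (or per mini-batch), until the loss stops decreasing.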