What I'm coding: I'm building a simple neural network with a weight matrix w and a second parameter u for the score. After multiplying my input vector by w, the result is multiplied by a vector u to get a single number, and that is my score.
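To make the setup concrete, here is a rough Java sketch of the score computation I describe above (the class, method and variable names are just illustrative, not my real code):

class ScoreNet {
    // score = u . (W * x): multiply the input vector x by the weight matrix w,
    // then take the dot product of the result with u to get a single number.
    static double forwardScore(double[][] w, double[] u, double[] x) {
        double[] hidden = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            for (int j = 0; j < x.length; j++) {
                hidden[i] += w[i][j] * x[j];   // hidden = W * x
            }
        }
        double score = 0.0;
        for (int i = 0; i < u.length; i++) {
            score += u[i] * hidden[i];         // score = u . hidden
        }
        return score;
    }
}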
Where I'm right now: I have calculated the gradients of both parameters with respect to my loss function.
My problem: Now I'm stuck on what to do next.
My solution proposal: Can I update the parameters with w = w + learningRate * w_grad (and likewise u = u + learningRate * u_grad) and repeat this procedure until my cost / loss value decreases... does this work? Is this correct? Is this a simple implementation of Stochastic Gradient Descent?
I'm coding in Java; if you have a simple, well-documented example of how to optimize a neural net in an easy way, please share it with me.
Thanks in advance!
I suppose that w_grad holds the partial derivatives of the loss. What you propose is indeed an iterative optimization scheme. Just one clarification: instead of w = w + learningRate * w_grad you should use w = w - learningRate * w_grad, because you want to step against the gradient to decrease the loss. This works fine, but if you have a multicore machine it will use only one core. If you need a performance boost you can try a batch algorithm: w = w - learningRate * Sum(w_grad), where the gradients of several samples are summed before the update. The performance boost comes from computing those per-sample gradients in parallel.
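To make the update rule concrete, a minimal Java sketch of one gradient descent step could look like this (assuming wGrad and uGrad already hold the partial derivatives of the loss for the current sample; the names are placeholders, not your actual code):

class SgdUpdate {
    // One stochastic gradient descent step: move each parameter a small amount
    // against its gradient so the loss goes down.
    static void sgdStep(double[][] w, double[] u,
                        double[][] wGrad, double[] uGrad, double learningRate) {
        for (int i = 0; i < w.length; i++) {
            for (int j = 0; j < w[i].length; j++) {
                w[i][j] -= learningRate * wGrad[i][j];   // note the minus sign
            }
        }
        for (int i = 0; i < u.length; i++) {
            u[i] -= learningRate * uGrad[i];
        }
    }
}

For the batch variant, you would sum the gradients of several samples (those per-sample gradient computations are independent and can run on different cores) and then call the same step once with the summed gradients.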