I know a neural network can be trained using gradient descent and I understand how it works.
Recently, I stumbled upon other training algorithms: conjugate gradient and quasi-Newton algorithms. I tried to understand how they work, but the only good intuition I could get is that they use higher-order derivatives.
Are the alternative algorithms I mentioned fundamentally different from backpropagation, where weights are adjusted using the gradient of the loss function?
If not, is there an algorithm to train a neural network that is fundamentally different from the mechanism of backpropagation?
Conjugate gradient and quasi-Newton methods are still gradient-based optimization algorithms. Backpropagation (or backprop) is nothing more than a fancy name for a gradient computation.
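To make that concrete, here is a minimal sketch (assuming PyTorch; the model, data, and hyperparameters are purely illustrative) in which plain gradient descent (SGD) and a quasi-Newton method (L-BFGS) consume exactly the same gradients, computed by `loss.backward()`, i.e. backprop. Only the update rule built on top of those gradients differs:

```python
# Sketch only: both optimizers rely on backprop for gradients; they differ
# in how they turn those gradients into a parameter update.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(64, 3)            # toy inputs (illustrative)
y = torch.randn(64, 1)            # toy targets (illustrative)
model = nn.Sequential(nn.Linear(3, 8), nn.Tanh(), nn.Linear(8, 1))
loss_fn = nn.MSELoss()

# Plain gradient descent: step = -lr * gradient
sgd = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(100):
    sgd.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()               # backprop: computes the gradient
    sgd.step()                    # update: follow the raw gradient

# Quasi-Newton (L-BFGS): same gradients, but the step direction is rescaled
# by an approximation of the inverse Hessian built from gradient history.
lbfgs = torch.optim.LBFGS(model.parameters(), lr=0.5, max_iter=20)

def closure():
    lbfgs.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()               # still backprop underneath
    return loss

lbfgs.step(closure)
print(loss_fn(model(X), y).item())
```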
However, the original question about alternatives to backprop is an important one. One recent alternative, for example, is equilibrium propagation (or eqprop for short).
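To give a flavour of how it differs, here is a highly simplified NumPy sketch of eqprop on a single hidden layer, loosely following the two-phase recipe of Scellier & Bengio (2017); the network size, nudging strength, relaxation schedule, and learning rate are illustrative assumptions, not canonical values. The key point is that the weight update is local and contrastive: it compares two equilibria of the network instead of propagating error derivatives backwards through the layers.

```python
# Minimal eqprop sketch (assumptions: hard-sigmoid units, one hidden layer,
# toy hyperparameters). Not a faithful reference implementation.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 3, 16, 1
W1 = rng.normal(0.0, 0.1, (n_in, n_hid))
W2 = rng.normal(0.0, 0.1, (n_hid, n_out))

rho = lambda s: np.clip(s, 0.0, 1.0)                      # hard sigmoid
drho = lambda s: ((s >= 0.0) & (s <= 1.0)).astype(float)  # its linear region

def relax(x, target=None, beta=0.0, steps=100, eps=0.2):
    """Settle hidden/output states by descending the network energy;
    when beta > 0 the output is weakly nudged toward the target."""
    h = np.zeros(n_hid)
    y = np.zeros(n_out)
    for _ in range(steps):
        dEdh = h - drho(h) * (rho(x) @ W1 + W2 @ rho(y))
        dEdy = y - drho(y) * (rho(h) @ W2)
        if target is not None:
            dEdy += beta * (y - target)                   # nudge term
        h -= eps * dEdh
        y -= eps * dEdy
    return h, y

def eqprop_step(x, target, beta=0.5, lr=0.05):
    global W1, W2
    h0, y0 = relax(x)                  # phase 1: free equilibrium
    hb, yb = relax(x, target, beta)    # phase 2: weakly nudged equilibrium
    # Local, contrastive weight update -- no backward pass through layers.
    W1 += (lr / beta) * (np.outer(rho(x), rho(hb)) - np.outer(rho(x), rho(h0)))
    W2 += (lr / beta) * (np.outer(rho(hb), rho(yb)) - np.outer(rho(h0), rho(y0)))

# Toy usage: learn to map a fixed input to a fixed target.
x, target = np.array([0.2, 0.7, 0.1]), np.array([0.8])
for _ in range(200):
    eqprop_step(x, target)
_, y_final = relax(x)
print("free-phase output after training:", y_final, "target:", target)
```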
Edit 28/02/2024: Since the question is quite general ("an algorithm to train a neural network"), here are a few more techniques worth mentioning: