chainer, chainercv

How to accumulate gradients across mini-batches and then back-propagate in Chainer?


I am classifying video sequences, and I need two things:

  1. Because of limited GPU memory, I want to accumulate gradients across several mini-batches, average them, and only then apply the parameter update.

  2. I need to shuffle the order of mini-batches without shuffling the samples inside each mini-batch, because each video sequence must keep its frame order.


Solution

  • Question 1: You can run the forward and backward passes for each minibatch without calling optimizer.update(); Chainer adds newly computed gradients onto the gradients already stored in the parameters. After you have repeated forward and backward for the necessary number of minibatches, call optimizer.update() once to update the parameters based on the accumulated gradients.

    If you want to achieve this with the trainer module, I think you need to subclass StandardUpdater and override its update logic to do the above.

    Question 2: Are you using the trainer module? If so, you can define your own iterator to achieve this; see below for how to define an iterator class.