I am classifying video sequences, and I need two things:
Because of limited GPU memory, I want to accumulate gradients across several mini-batches, average the accumulated gradient values, and only then apply the parameter update.
I also need to know how to shuffle the order of the mini-batches without shuffling inside each mini-batch, because each mini-batch is a video sequence that must keep its frame order.
Question 1: You can run the forward and backward passes for each mini-batch without calling `optimizer.update()`; the gradients accumulate across the backward calls. After you have repeated forward & backward for the necessary number of mini-batches, call `optimizer.update()` once to update the parameters based on the accumulated gradients.
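A minimal sketch of that manual loop, assuming a Chainer model whose `__call__(x, t)` returns the loss and an optimizer already set up with `optimizer.setup(model)`; `get_minibatch` and `n_accum` are hypothetical placeholders:

```python
n_accum = 4  # hypothetical: number of mini-batches to accumulate over

model.cleargrads()               # clear gradients once, before the loop
for i in range(n_accum):
    x, t = get_minibatch(i)      # hypothetical helper returning one mini-batch
    loss = model(x, t)           # forward pass
    loss.backward()              # backward pass; gradients accumulate in param.grad

# average the accumulated gradients, then apply a single update
for param in model.params():
    if param.grad is not None:
        param.grad /= n_accum
optimizer.update()
```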
If you want to achieve it with the `trainer` module, I think you need to subclass `StandardUpdater` and define your own `Updater` class that does the above, as sketched below.
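Here is one way such an updater might look. This is a sketch rather than tested code; the class name and the `n_accum` argument are my own, but `get_optimizer`, `get_iterator`, `converter`, and `device` are part of `StandardUpdater`:

```python
from chainer import training

class AccumGradUpdater(training.StandardUpdater):
    """StandardUpdater variant that accumulates gradients over
    n_accum mini-batches before each parameter update."""

    def __init__(self, iterator, optimizer, n_accum=4, **kwargs):
        super(AccumGradUpdater, self).__init__(iterator, optimizer, **kwargs)
        self.n_accum = n_accum

    def update_core(self):
        optimizer = self.get_optimizer('main')
        model = optimizer.target
        it = self.get_iterator('main')

        model.cleargrads()
        for _ in range(self.n_accum):
            batch = it.next()
            x, t = self.converter(batch, self.device)
            loss = model(x, t)   # assumes the model's __call__ returns the loss
            loss.backward()      # gradients accumulate across iterations

        # average the accumulated gradients, then update once
        for param in model.params():
            if param.grad is not None:
                param.grad /= self.n_accum
        optimizer.update()
```

You would then pass it to the trainer in place of the standard updater, e.g. `trainer = training.Trainer(AccumGradUpdater(train_iter, optimizer, n_accum=4), (20, 'epoch'))`.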
Question 2: Are you using the `trainer` module? If so, you can define your own iterator class to achieve this. See the sketch below for one way to define such an iterator.
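A minimal sketch, assuming the dataset is stored as consecutive fixed-length sequences so that each contiguous slice of `batch_size` items is one video sequence. The class name is mine, but `epoch`, `is_new_epoch`, and `epoch_detail` are the attributes the trainer expects from an iterator:

```python
import numpy as np
from chainer.dataset import Iterator

class SequenceBatchIterator(Iterator):
    """Shuffles the *order* of mini-batches each epoch, while each
    mini-batch is a contiguous slice, so frame order is preserved."""

    def __init__(self, dataset, batch_size):
        self.dataset = dataset
        self.batch_size = batch_size
        self.n_batches = len(dataset) // batch_size
        self.epoch = 0
        self.is_new_epoch = False
        self._order = np.random.permutation(self.n_batches)
        self._pos = 0

    def __next__(self):
        # pick the next (shuffled) batch index; read it as a contiguous slice
        start = self._order[self._pos] * self.batch_size
        batch = [self.dataset[i] for i in range(start, start + self.batch_size)]

        self._pos += 1
        self.is_new_epoch = (self._pos == self.n_batches)
        if self.is_new_epoch:
            self.epoch += 1
            self._pos = 0
            self._order = np.random.permutation(self.n_batches)  # reshuffle batch order
        return batch

    next = __next__  # Python 2 compatibility, as in Chainer's own iterators

    @property
    def epoch_detail(self):
        return self.epoch + self._pos / float(self.n_batches)
```

You can pass this iterator to `StandardUpdater` (or the accumulating updater above) in place of `SerialIterator`. Note that a complete implementation would also define `serialize` so that trainer snapshots can save and restore the iterator's state.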