machine-learning keras epoch unsupervised-learning adam

How does the setting of steps_per_epoch and epochs affect the training result in Keras?


My generator always yields two randomly chosen images from my dataset, and I calculate the loss using these two samples. Say I set steps_per_epoch=40 and epochs=5. What would be the difference if I set steps_per_epoch=5 and epochs=40 instead (I use Adam as my optimizer)?


Solution

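  • The epochs argument refers to the number of full passes over the whole training data. The steps_per_epoch argument refers to the number of batches drawn during one epoch. For a finite dataset we therefore have steps_per_epoch = n_samples / batch_size (rounded up if the division is not exact). With a generator that yields random batches indefinitely, as in the question, there is no natural "full pass", so Keras simply ends an epoch after steps_per_epoch batches.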

    For example, if we have 1000 training samples and set the batch size to 10, then steps_per_epoch = 1000 / 10 = 100. The number of epochs can be set independently of the batch size and steps_per_epoch.

    There is no single batch size that works for all scenarios. Usually, a very large batch size slows down training (i.e. the model takes more time to converge to a solution), while a very small batch size may not make good use of the available resources (i.e. GPU and CPU). Common values include 32, 64, 128, 256 and 512 (powers of 2 are generally said to help with GPU memory allocation). Also, here is an answer on Stack Overflow that addresses this issue and cites relevant books and papers. Or take a look at this question and its answers on Cross Validated for a more complete definition of batch size.
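
To make the arithmetic above concrete, here is a minimal sketch in plain Python. The helper names steps_for_dataset and total_updates are hypothetical and not part of Keras; the sketch only assumes that one step corresponds to one batch and one optimizer update.

```python
import math


def steps_for_dataset(n_samples: int, batch_size: int) -> int:
    """Batches needed to see every sample once (hypothetical helper).

    The result is rounded up so that a final, smaller batch is counted too.
    """
    return math.ceil(n_samples / batch_size)


def total_updates(steps_per_epoch: int, epochs: int) -> int:
    """Total number of batches, and hence optimizer updates, over a whole run."""
    return steps_per_epoch * epochs


# The example from the answer: 1000 samples, batch size 10.
print(steps_for_dataset(1000, 10))

# The two settings from the question.
print(total_updates(steps_per_epoch=40, epochs=5))
print(total_updates(steps_per_epoch=5, epochs=40))
```

Both settings from the question give the same number of updates. With a generator that samples at random, the weight updates themselves should therefore be comparable, since the optimizer state is kept from one epoch to the next. What changes is how often per-epoch work happens, such as validation, callbacks and logging.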