So I've been playing around with this code: https://www.tensorflow.org/tutorials/generative/dcgan and have a fairly good idea of how it works. However, I can't figure out what the BUFFER_SIZE variable is for. I suspect it is used to create a subset of the dataset of size BUFFER_SIZE, with the batches then taken from that subset, but I don't see the point of that and can't find anyone explaining it.
So, if someone could explain to me what BUFFER_SIZE does, I would be thankful ❤
It's used as the buffer_size argument in tf.data.Dataset.shuffle. Have you read the docs?
This dataset fills a buffer with buffer_size elements, then randomly samples elements from this buffer, replacing the selected elements with new elements. For perfect shuffling, a buffer size greater than or equal to the full size of the dataset is required.
For instance, if your dataset contains 10,000 elements but buffer_size is set to 1,000, then shuffle will initially select a random element from only the first 1,000 elements in the buffer. Once an element is selected, its space in the buffer is replaced by the next (i.e. 1,001-st) element, maintaining the 1,000 element buffer.
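To make the mechanism in the docs concrete, here is a minimal pure-Python sketch of a fixed-size shuffle buffer. It is only an illustration of the behaviour described above, not TensorFlow's actual implementation; the function name buffered_shuffle and its seed parameter are made up for this example.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in the order a fixed-size shuffle buffer would emit them.

    Hypothetical helper for illustration only; not part of TensorFlow.
    """
    rng = random.Random(seed)
    it = iter(items)
    buffer = []

    # Fill the buffer with the first buffer_size elements.
    for x in it:
        buffer.append(x)
        if len(buffer) == buffer_size:
            break

    # Emit a random slot, then refill that slot with the next input element.
    for x in it:
        i = rng.randrange(len(buffer))
        yield buffer[i]
        buffer[i] = x

    # Input exhausted: emit whatever is left in the buffer in random order.
    rng.shuffle(buffer)
    yield from buffer


# 10,000 elements with a 1,000-element buffer, as in the docs' example.
out = list(buffered_shuffle(range(10_000), buffer_size=1_000, seed=0))
print(out[:10])  # every one of these values is below 1,000 + its position
```

Two edge cases show why the size matters: with buffer_size=1 the output is just the original order (no shuffling at all), and with buffer_size at least as large as the dataset the whole dataset sits in the buffer before anything is emitted, which gives a uniform shuffle. The DCGAN tutorial sets BUFFER_SIZE to 60000, the size of the MNIST training set, so there it amounts to a full shuffle.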