python tensorflow keras artificial-intelligence

Could validation data be a generator in tensorflow.keras 2.0?


In the official documentation of tensorflow.keras,

validation_data could be:

  • tuple (x_val, y_val) of Numpy arrays or tensors
  • tuple (x_val, y_val, val_sample_weights) of Numpy arrays
  • dataset

  For the first two cases, batch_size must be provided. For the last case, validation_steps could be provided.
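For reference, here is a minimal sketch of the first and last documented cases; the model and the arrays below are placeholders I made up, not code from the docs:

import numpy as np
import tensorflow as tf

# Toy model and data, just to illustrate the documented call signatures.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation='sigmoid', input_shape=(10,))])
model.compile(loss='binary_crossentropy', optimizer='adam')

x_train, y_train = np.random.rand(256, 10), np.random.randint(0, 2, (256, 1))
x_val, y_val = np.random.rand(64, 10), np.random.randint(0, 2, (64, 1))

# Tuple of Numpy arrays: batch_size is given explicitly.
model.fit(x_train, y_train, batch_size=32, epochs=1, validation_data=(x_val, y_val))

# Dataset: validation_steps can be given instead.
val_ds = tf.data.Dataset.from_tensor_slices((x_val, y_val)).batch(32).repeat()
model.fit(x_train, y_train, batch_size=32, epochs=1, validation_data=val_ds, validation_steps=2)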

It does not mention whether a generator could act as validation_data. So I want to know: can validation_data be a data generator, like in the following code?

net.fit_generator(train_it.generator(), epoch_iterations * batch_size, nb_epoch=nb_epoch, verbose=1,
                  validation_data=val_it.generator(), nb_val_samples=3,
                  callbacks=[checker, tb, stopper, saver])

Update: The official Keras documentation has the same content, but one more sentence is added:

  • dataset or a dataset iterator

Considering that

  • dataset

  For the first two cases, batch_size must be provided. For the last case, validation_steps could be provided.

I think there should be three cases. Keras's documentation is correct, so I will file an issue against tensorflow.keras to update its documentation.


Solution

  • Yes, it can. It is strange that this is not in the docs, but it works exactly like the x argument: you can also pass a keras.utils.Sequence or a generator. In my projects I often use a keras.utils.Sequence, which acts like a generator (a minimal Sequence sketch follows the example output below).

    Minimal working example showing that it works:

    import numpy as np
    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import Dense
    
    def generator(batch_size):
        # Create dummy arrays holding one batch of features and labels,
        # then yield that same batch forever.
        batch_features = np.zeros((batch_size, 1000))
        batch_labels = np.zeros((batch_size, 1))
        while True:
            yield batch_features, batch_labels
    
    model = Sequential()
    model.add(Dense(125, input_shape=(1000,), activation='relu'))
    model.add(Dense(8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    
    train_generator = generator(64)
    validation_generator = generator(64)
    
    # Both the training data and validation_data are plain Python generators.
    model.fit(train_generator, validation_data=validation_generator,
              validation_steps=100, epochs=100, steps_per_epoch=100)
    

    100/100 [==============================] - 1s 13ms/step - loss: 0.6689 - accuracy: 1.0000 - val_loss: 0.6448 - val_accuracy: 1.0000
    Epoch 2/100
    100/100 [==============================] - 0s 4ms/step - loss: 0.6223 - accuracy: 1.0000 - val_loss: 0.6000 - val_accuracy: 1.0000
    Epoch 3/100
    100/100 [==============================] - 0s 4ms/step - loss: 0.5792 - accuracy: 1.0000 - val_loss: 0.5586 - val_accuracy: 1.0000
    Epoch 4/100
    100/100 [==============================] - 0s 4ms/step - loss: 0.5393 - accuracy: 1.0000 - val_loss: 0.5203 - val_accuracy: 1.0000
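
    For completeness, here is a minimal sketch of a keras.utils.Sequence used the same way; the class name and the dummy data are my own, not part of the original answer:

    import numpy as np
    from tensorflow.keras.utils import Sequence
    
    class DummySequence(Sequence):
        # A Sequence returns (features, labels) batches by index and declares
        # its length, so Keras knows how many steps make up one epoch.
        def __init__(self, batch_size, num_batches=100):
            self.batch_size = batch_size
            self.num_batches = num_batches
    
        def __len__(self):
            return self.num_batches
    
        def __getitem__(self, idx):
            batch_features = np.zeros((self.batch_size, 1000))
            batch_labels = np.zeros((self.batch_size, 1))
            return batch_features, batch_labels
    
    # It can be passed both as the training data and as validation_data,
    # and steps_per_epoch / validation_steps are then inferred from len():
    # model.fit(DummySequence(64), validation_data=DummySequence(64), epochs=100)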