I'm pretty new to data generators and tf.data.Dataset in TensorFlow. I'm struggling with sizing batches, epochs, and steps... I can't figure out the right setup to get rid of the error "Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence".
I tried using the size of a chunk of data produced by the data generator, the complete size of my dataset, and the size of the split datasets, but none of them work.
Here is a simplified version of my last attempt:
import math
import numpy as np
import tensorflow as tf
from pyspark.sql.functions import col

def data_generator(df, chunk_size):
    total_number_sample = 10000
    # Walk through the Spark DataFrame chunk by chunk (idx is 1-based).
    for start_idx in range(1, total_number_sample, chunk_size):
        end_idx = start_idx + chunk_size - 1
        df_subset = df.where(col('idx').between(start_idx, end_idx))
        feature = np.array(df_subset.select("vector_features_scaled").rdd.map(lambda row: row[0].toArray()).collect())
        label = df_subset.select("ptype_s_l_m_v").toPandas().values.flatten()
        yield feature, label  # each yield is a whole chunk, not a single sample

dataset = tf.data.Dataset.from_generator(
    lambda: data_generator(df, chunk_size),
    output_signature=(
        tf.TensorSpec(shape=(None, 24), dtype=tf.float32),
        tf.TensorSpec(shape=(None, 4), dtype=tf.float32)
    ))
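If I inspect one element of this dataset, I get a whole chunk at once (shapes as declared in the output_signature):

features, labels = next(iter(dataset))
print(features.shape)  # (chunk_size, 24): one dataset element = one chunk
print(labels.shape)    # (chunk_size, 4)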
I split and batch the data this way for training/validation:
batch_sz = 100
split_ratio = .9
split_size = math.floor((chunk_size * 10) * split_ratio)

train_dataset = dataset.take(split_size).batch(batch_sz)
train_dataset = train_dataset.prefetch(tf.data.experimental.AUTOTUNE)
test_dataset = dataset.skip(split_size).batch(batch_sz)
test_dataset = test_dataset.prefetch(tf.data.experimental.AUTOTUNE)

steps_per_epoch = math.ceil(10000 * split_ratio / batch_sz)
validation_steps = math.ceil((10000 - split_size) / batch_sz)
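For concreteness (chunk_size is defined elsewhere in my code, so assume a hypothetical chunk_size = 1000), the numbers work out to:

split_size       = floor(1000 * 10 * 0.9)     -> 9000
steps_per_epoch  = ceil(10000 * 0.9 / 100)    -> 90
validation_steps = ceil((10000 - 9000) / 100) -> 10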
model.fit(train_dataset,
          steps_per_epoch=steps_per_epoch,
          epochs=3,
          validation_data=test_dataset,
          validation_steps=validation_steps,
          verbose=2)

results = model.evaluate(dataset.batch(batch_sz))
Without batching everything works great (both model.fit() and model.evaluate()),
but when I use batching I get this error:
W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node IteratorGetNext}}]]
/usr/lib/python3.11/contextlib.py:155: UserWarning: Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches. You may need to use the `.repeat()` function when building your dataset.
self.gen.throw(typ, value, traceback)
I have seen a lot of threads about steps_per_epoch, epochs, and batch size, but I can't find a solution that works once the data is split.
I finally found the problem.
A TensorFlow Dataset is itself a kind of data generator, so there is no need to chunk the data in a custom generator before passing it to tf.data.Dataset: the generator should yield individual samples.
Use .batch() to produce the "chunks" of data that are read at each iteration. With the chunked generator above, each dataset element was a whole chunk, so take(), skip(), and the step counts were counting chunks while my math assumed samples, and the dataset ran out of batches long before steps_per_epoch was reached.
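For reference, here is a minimal sketch of the fix applied to the code above. It assumes the same DataFrame columns, that ptype_s_l_m_v holds a 4-element vector per row (as the original output_signature suggests), and uses toLocalIterator() as just one possible way to stream rows one at a time:

import math
import numpy as np
import tensorflow as tf

def data_generator(df):
    # Yield one (feature, label) pair per row; batching is left to tf.data.
    for row in df.select("vector_features_scaled", "ptype_s_l_m_v").toLocalIterator():
        yield row[0].toArray(), np.asarray(row[1], dtype=np.float32)

dataset = tf.data.Dataset.from_generator(
    lambda: data_generator(df),
    output_signature=(
        tf.TensorSpec(shape=(24,), dtype=tf.float32),  # one sample
        tf.TensorSpec(shape=(4,), dtype=tf.float32)    # one label
    ))

split_size = math.floor(10000 * split_ratio)  # now counted in samples
train_dataset = dataset.take(split_size).batch(batch_sz).prefetch(tf.data.experimental.AUTOTUNE)
test_dataset = dataset.skip(split_size).batch(batch_sz).prefetch(tf.data.experimental.AUTOTUNE)

# take()/skip() and the step math now use the same unit (samples), so
# steps_per_epoch and validation_steps can simply be dropped: Keras stops
# each epoch when the finite dataset is exhausted.
model.fit(train_dataset, epochs=3, validation_data=test_dataset, verbose=2)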