python  tensorflow  dataset  eager

tf.data.Dataset.map(map_func) with Eager Mode


I am using TF 1.8 with eager mode enabled.

I cannot print the example inside mapfunc. When I run tf.executing_eagerly() from within mapfunc, I get "False".

import os
import tensorflow as tf
tf.logging.set_verbosity(tf.logging.ERROR)

tfe = tf.contrib.eager
tf.enable_eager_execution()

# An eager tensor: printing it shows concrete values.
x = tf.random_uniform([16, 10], -10, 0, tf.int64)
print(x)
DS = tf.data.Dataset.from_tensor_slices(x)


def mapfunc(ex, con):
    import pdb; pdb.set_trace()  # drop into the debugger to inspect ex
    new_ex = ex + con
    print(new_ex)
    return new_ex

DS = DS.map(lambda x: mapfunc(x, [7]))
DS = DS.make_one_shot_iterator()

print(DS.next())

print(new_ex) outputs:

Tensor("add:0", shape=(10,), dtype=int64)

Outside mapfunc, this works fine, but inside it the passed example has no value and no .numpy() attribute.
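
For comparison, a minimal check along these lines (run outside of map, after the snippet above) does print concrete values and exposes .numpy():

ex = x[0]                                     # first row of x, an EagerTensor
new_ex = ex + tf.constant(7, dtype=tf.int64)  # same addition as in mapfunc
print(new_ex)                                 # shows concrete values
print(new_ex.numpy())                         # works: returns a NumPy array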


Solution

  • The tf.data transformations actually execute as a graph, so the body of the map function itself isn't executed eagerly. See #14732 for some more discussion on this.

    If you really need eager execution for the map function, you could use tf.contrib.eager.py_func, something like:

    DS = DS.map(lambda x: tf.contrib.eager.py_func(
      mapfunc,
      [x, tf.constant(7, dtype=tf.int64)], tf.int64))
    # In TF 1.9+, the next line can be print(next(DS))
    print(DS.make_one_shot_iterator().next())
    

    Hope that helps.

    Note that by adding a py_func to the dataset, the single-threaded Python interpreter will be in the loop for every element produced.
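
    To make this concrete, here is a rough, self-contained sketch of the py_func approach (reusing x and mapfunc from the question above, with the pdb call removed); inside the wrapped mapfunc, ex arrives as an EagerTensor, so .numpy() is available:

    def mapfunc(ex, con):
        # Executed eagerly via py_func: ex and con are EagerTensors here.
        new_ex = ex + con
        print(new_ex.numpy())  # concrete values instead of a symbolic "add:0" tensor
        return new_ex

    DS = tf.data.Dataset.from_tensor_slices(x)
    DS = DS.map(lambda x: tf.contrib.eager.py_func(
      mapfunc,
      [x, tf.constant(7, dtype=tf.int64)], tf.int64))
    print(DS.make_one_shot_iterator().next())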