python, tensorflow, tensorflow2.0, eager

Train complicated NN models with tf.eager (better with TF2 symbolic support)


Is there a (more or less) simple way to write a complicated NN model so that it is trainable in eager mode? Are there any examples of such code?

For example, I want to use InceptionResnetV2. I have code written with tf.contrib.slim. According to this issue, https://github.com/tensorflow/tensorflow/issues/16182 , slim is deprecated and I need to use Keras. And I really can't use the slim code for training with eager, because I can't take the list of variables and apply gradients (OK, I can try to wrap the model in a GradientTape, but I'm not sure what to do with the regularization loss).
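
For illustration, this is roughly what I mean (a toy tf.keras model with an explicit L2 regularizer stands in for the slim network, and I am only guessing that collecting the regularization terms from model.losses inside the tape is the right way to do it):

    import tensorflow as tf
    tf.enable_eager_execution()

    # Toy stand-in for the real network: one regularized layer so that
    # model.losses is non-empty.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(32,),
                              kernel_regularizer=tf.keras.regularizers.l2(1e-4)),
        tf.keras.layers.Dense(10)
    ])
    optimizer = tf.train.AdamOptimizer()

    def train_step(x, y):
        with tf.GradientTape() as tape:
            logits = model(x, training=True)
            task_loss = tf.reduce_mean(
                tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))
            # Regularization terms created by the layers end up in model.losses;
            # add them to the task loss inside the tape.
            total_loss = task_loss + tf.add_n(model.losses)
        grads = tape.gradient(total_loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return total_loss

    loss = train_step(tf.random.normal((8, 32)),
                      tf.random.uniform((8,), maxval=10, dtype=tf.int32))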

Ok, let's try Keras.

In [30]: tf.__version__                                                                                                                                                                          
Out[30]: '1.13.1'

In [31]: tf.enable_eager_execution()

In [32]: from keras.applications.inception_resnet_v2 import InceptionResNetV2

In [33]: model = InceptionResNetV2(weights=None)
...
/usr/local/lib/python3.6/dist-packages/keras_applications/inception_resnet_v2.py in InceptionResNetV2(include_top, weights, input_tensor, input_shape, pooling, classes, **kwargs)
    246 
    247     if input_tensor is None:
--> 248         img_input = layers.Input(shape=input_shape)
    249     else:
    250         if not backend.is_keras_tensor(input_tensor):
...
RuntimeError: tf.placeholder() is not compatible with eager execution.

Doesn't work by default.

In this tutorial they say that I need to write my own model class and maintain the variables myself: https://www.tensorflow.org/tutorials/eager/custom_training#define_the_model. I'm not sure I want to do that for Inception; there are too many variables to create and maintain. It feels like going back to the old versions of TF, to the days when even slim didn't exist.
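
The pattern from that tutorial is roughly this (my paraphrase of its toy example):

    import tensorflow as tf
    tf.enable_eager_execution()

    # Every variable is created and tracked by hand -- fine for a toy linear
    # model, but painful for something the size of InceptionResnetV2.
    class ToyModel(object):
        def __init__(self):
            self.W = tf.Variable(5.0)
            self.b = tf.Variable(0.0)

        def __call__(self, x):
            return self.W * x + self.b

    model = ToyModel()
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(model(tf.constant([1.0, 2.0])) - 3.0))
    dW, db = tape.gradient(loss, [model.W, model.b])
    model.W.assign_sub(0.1 * dW)
    model.b.assign_sub(0.1 * db)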

In this tutorial networks are created with Keras: https://www.tensorflow.org/tutorials/eager/custom_training_walkthrough#create_a_model_using_keras, but I doubt that I can easily maintain a complicated structure that way, by only defining the model without connecting it to an Input. For example, in this article, if I understand correctly, the author initializes a Keras Input and propagates it through the model (which causes the RuntimeError when used with eager, as you saw earlier). I can make my own model by subclassing the Model class as described here: https://www.tensorflow.org/api_docs/python/tf/keras/Model . Oops, that way I need to maintain layers instead of variables. It seems like almost the same problem to me.
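
For reference, the subclassing pattern looks something like this (a toy model, not Inception); the layers do register their variables automatically, but every layer still has to be declared and wired by hand:

    import tensorflow as tf
    tf.enable_eager_execution()

    class SmallNet(tf.keras.Model):
        def __init__(self, num_classes=10):
            super(SmallNet, self).__init__()
            # Layers are declared once here; their variables are created on
            # first use and tracked by the model automatically.
            self.conv = tf.keras.layers.Conv2D(16, 3, activation='relu')
            self.pool = tf.keras.layers.GlobalAveragePooling2D()
            self.fc = tf.keras.layers.Dense(num_classes)

        def call(self, inputs):
            x = self.conv(inputs)
            x = self.pool(x)
            return self.fc(x)

    model = SmallNet()
    _ = model(tf.random.normal((1, 32, 32, 3)))   # builds the variables
    print(len(model.trainable_variables))         # no manual variable bookkeeping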

There is an interesting mention of AutoGraph here: https://www.tensorflow.org/beta/guide/autograph#keras_and_autograph . They only override __call__, so it seems like I wouldn't need to maintain variables in this case, but I haven't tested it yet.
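
The pattern from that guide is roughly the following (untested by me; it relies on tf.function, so TF 1.14+ or 2.0):

    import tensorflow as tf

    class DynamicModel(tf.keras.Model):
        def __init__(self):
            super(DynamicModel, self).__init__()
            self.dense = tf.keras.layers.Dense(10)

        @tf.function
        def call(self, inputs):
            # AutoGraph converts this data-dependent Python `if` into a tf.cond.
            if tf.reduce_mean(inputs) > 0:
                inputs = inputs * 2.0
            return self.dense(inputs)

    model = DynamicModel()
    out = model(tf.random.normal((4, 8)))
    # Variables are still tracked by the layers, not by hand.
    print(len(model.trainable_variables))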


So, is there any simple solution?

Wrap the slim model in a GradientTape? How do I then apply the reg loss to the weights?

Track every variable myself? That sounds a bit painful.

Use Keras? How do I use it with eager when the model has branches and a complicated structure?


Solution

  • Your first approach is probably the most common. This error:

    RuntimeError: tf.placeholder() is not compatible with eager execution.

    is because one cannot use a tf.placeholder in eager mode. There is no concept of such a thing when executing eagerly.

    You could use the tf.data API to build a dataset for your training data and feed it to the model. Something like this, with the dummy datasets replaced by your real data (note that the snippet uses tf.keras.applications rather than the standalone keras package):

    import tensorflow as tf
    tf.enable_eager_execution()
    
    model = tf.keras.applications.inception_resnet_v2.InceptionResNetV2(weights=None)
    
    BATCH_SIZE = 2
    
    # The model ends in a 1000-way softmax, so use the sparse loss with the
    # integer labels below (or one-hot encode the labels first and keep
    # categorical_crossentropy).
    model.compile(tf.keras.optimizers.Adam(), loss=tf.keras.losses.sparse_categorical_crossentropy)
    
    ### Replace with tf.data.Datasets for your actual training data!
    train_x = tf.data.Dataset.from_tensor_slices(tf.random.normal((10, 299, 299, 3)))
    train_y = tf.data.Dataset.from_tensor_slices(tf.random.uniform((10,), maxval=10, dtype=tf.int32))
    training_data = tf.data.Dataset.zip((train_x, train_y)).batch(BATCH_SIZE)
    
    model.fit(training_data)
    
    

    This approach also works in TensorFlow 2.0, as mentioned in your title; there eager execution is on by default, so you would simply drop the tf.enable_eager_execution() call.