Tags: tensorflow, keras, backend, openmax, cifar100

keras.backend.function() won't accept model.layers[0].input as input tensor


I'm trying to use this TensorFlow implementation of OpenMax and adapt it to CIFAR-100 for my project. As part of OpenMax, you have to use the activation vector of the penultimate layer instead of softmax, for which the following function is used:

from tensorflow.keras import backend as K

def get_activations(model, layer, X_batch):
    # build a backend function mapping (input tensor, learning phase)
    # to the output of the requested layer
    get_activations = K.function(
        [model.layers[0].input, K.learning_phase()],
        [model.layers[layer].output])

    # the trailing 0 is the learning-phase flag (0 = inference)
    activations = get_activations(
        [DataGenerator(X_batch, mode='predict', batch_size=8,
                       augment=False, shuffle=False), 0])[0]
    # print (activations.shape)
    return activations

Function inputs: layer is the layer index (penultimate layer: -2) and X_batch is a batch of input images (shape and type: (597, 32, 32, 3), <class 'numpy.ndarray'>). The model looks like this:

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Dropout, GlobalAveragePooling2D
from efficientnet.tfkeras import EfficientNetB0

height = 224
width = 224
channels = 3
n_classes = 20
input_shape = (height, width, channels)

# Define the input layer using the functional API
inputs = Input(shape=input_shape, name='input_layer')

# Create the EfficientNetB0 model
efnb0 = EfficientNetB0(weights='imagenet', include_top=False, input_shape=input_shape)

# Add the remaining layers
x = efnb0(inputs)
x = GlobalAveragePooling2D()(x)
x = Dropout(0.5)(x)
x = Dense(n_classes, activation='relu')(x)
outputs = Dense(n_classes, activation='softmax')(x)

# Define the model
model = tf.keras.models.Model(inputs=inputs, outputs=outputs)

model.summary()
Model: "model_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_layer (InputLayer)    [(None, 224, 224, 3)]     0         
                                                                 
 efficientnet-b0 (Functional  (None, 7, 7, 1280)       4049564   
 )                                                               
                                                                 
 global_average_pooling2d (G  (None, 1280)             0         
 lobalAveragePooling2D)                                          
                                                                 
 dropout (Dropout)           (None, 1280)              0         
                                                                 
 dense (Dense)               (None, 20)                25620     
                                                                 
 dense_1 (Dense)             (None, 20)                420       
                                                                 
=================================================================
Total params: 4,075,604
Trainable params: 4,033,588
Non-trainable params: 42,016
_________________________________________________________________

The DataGenerator is used to resize the input images to (224, 224, 3) efficiently. It probably has nothing to do with the input shape, though, as the call already throws an error at K.function(...). Here's the complete traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[29], line 74
     72 data = X_train, X_test, y_train, y_test
     73 #create_model(models[child_id], data)
---> 74 create_model(functional_model, data)
     75 for i in range(5):
     76     random_char = np.random.randint(0,len(X_test))

Cell In[28], line 254, in create_model(model, data)
    252 print(f'i, sep_x[i]: {i}, {sep_x[i].shape}')
    253 weibull_model[label[i]] = {}
--> 254 score, fc8 = compute_feature(sep_x[i], model)
    255 mean = compute_mean_vector(fc8)
    256 distance = compute_distances(mean, fc8, sep_y)

Cell In[28], line 131, in compute_feature(x, model)
    128 def compute_feature(x, model):
    129     # output = models[i].layers[-2].output         # define output layer, [-1] is last layer
    130     print(f'shape of sep_x: {x.shape}, {type(x)}')
--> 131     score = get_activations(model, -1, x)
    132     fc8 = get_activations(model, -2, x)
    133     return score, fc8

Cell In[28], line 208, in get_activations(model, layer, X_batch)
    205 def get_activations(model, layer, X_batch):
    206     # print (model.layers[6].output)
    207     X_batch_tensor = tf.convert_to_tensor(X_batch)
--> 208     get_activations = K.function(
    209         [model.layers[0].input, K.learning_phase()],
    210         [model.layers[layer].output])
    212     activations = get_activations([DataGenerator(X_batch, mode='predict', batch_size=8, augment=False, shuffle=False), 0])[0]
    213     # print (activations.shape)

File ~\Python310\lib\site-packages\keras\backend.py:4625, in function(inputs, outputs, updates, name, **kwargs)
   4622 from keras import models
   4623 from keras.utils import tf_utils
-> 4625 model = models.Model(inputs=inputs, outputs=outputs)
   4627 wrap_outputs = isinstance(outputs, list) and len(outputs) == 1
   4629 def func(model_inputs):

File ~\Python310\lib\site-packages\tensorflow\python\trackable\base.py:205, in no_automatic_dependency_tracking.<locals>._method_wrapper(self, *args, **kwargs)
    203 self._self_setattr_tracking = False  # pylint: disable=protected-access
    204 try:
--> 205   result = method(self, *args, **kwargs)
    206 finally:
    207   self._self_setattr_tracking = previous_value  # pylint: disable=protected-access

File ~\Python310\lib\site-packages\keras\engine\functional.py:157, in Functional.__init__(self, inputs, outputs, name, trainable, **kwargs)
    151 # Check if the inputs contain any intermediate `KerasTensor` (not
    152 # created by tf.keras.Input()). In this case we need to clone the `Node`
    153 # and `KerasTensor` objects to mimic rebuilding a new model from new
    154 # inputs.  This feature is only enabled in TF2 not in v1 graph mode.
    155 if tf.compat.v1.executing_eagerly_outside_functions():
    156     if not all(
--> 157         [
    158             functional_utils.is_input_keras_tensor(t)
    159             for t in tf.nest.flatten(inputs)
    160         ]
    161     ):
    162         inputs, outputs = functional_utils.clone_graph_nodes(
    163             inputs, outputs
    164         )
    165 self._init_graph_network(inputs, outputs)

File ~\Python310\lib\site-packages\keras\engine\functional.py:158, in <listcomp>(.0)
    151 # Check if the inputs contain any intermediate `KerasTensor` (not
    152 # created by tf.keras.Input()). In this case we need to clone the `Node`
    153 # and `KerasTensor` objects to mimic rebuilding a new model from new
    154 # inputs.  This feature is only enabled in TF2 not in v1 graph mode.
    155 if tf.compat.v1.executing_eagerly_outside_functions():
    156     if not all(
    157         [
--> 158             functional_utils.is_input_keras_tensor(t)
    159             for t in tf.nest.flatten(inputs)
    160         ]
    161     ):
    162         inputs, outputs = functional_utils.clone_graph_nodes(
    163             inputs, outputs
    164         )
    165 self._init_graph_network(inputs, outputs)

File ~\Python310\lib\site-packages\keras\engine\functional_utils.py:48, in is_input_keras_tensor(tensor)
     32 """Check if tensor is directly generated from `tf.keras.Input`.
     33 
     34 This check is useful when constructing the functional model, since we will
   (...)
     45   ValueError: if the tensor is not a KerasTensor instance.
     46 """
     47 if not node_module.is_keras_tensor(tensor):
---> 48     raise ValueError(_KERAS_TENSOR_TYPE_CHECK_ERROR_MSG.format(tensor))
     49 return tensor.node.is_input

ValueError: Found unexpected instance while processing input tensors for keras functional model. Expecting KerasTensor which is from tf.keras.Input() or output from keras layer call(). Got: 0

I thought the Keras backend function didn't like something about the input layer of EfficientNetB0, so I retrained the model with the Keras functional API, as shown above, but nothing changed (I had originally used the sequential API). I also tried updating Keras and TensorFlow to the latest versions, but still no luck. If you need anything else from the code, let me know; I didn't post everything, since it's a relatively large chunk of code.


Solution

  • This isn't an actual answer to the problem and doesn't explain why the error occurs, but I've found a workaround. According to fchollet's comment, the easiest way to get the intermediate results of a hidden layer is to take that layer's weights after training, build a new model with them, and predict your batch of images with that new model. I also tried to write a function similar to the one above with Theano, but that has all sorts of import issues too. Here's the full answer:

    One simple way to do it is to use the weights of your model to build a new model that's truncated at the layer you want to read. Then you can run the .predict(X_batch) method to get the activations for a batch of inputs.

    Example:

    # this is your initial model
    model = Sequential()
    model.add(Dense(20, 64, init='uniform'))
    model.add(Activation('tanh'))
    model.add(Dense(64, 1, init='uniform'))
    model.add(Activation('softmax'))
    
    # we train it
    model.compile(loss='mse', optimizer='sgd')
    model.fit(X_train, y_train, nb_epoch=20, batch_size=16)
    
    # we build a new model with the activations of the old model
    # this model is truncated after the first layer
    model2 = Sequential()
    model2.add(Dense(20, 64, weights=model.layers[0].get_weights()))
    model2.add(Activation('tanh'))
    
    activations = model2.predict(X_batch)
    

    Note: I haven't tested it.
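    That example uses the long-gone Keras 0.x API, but the idea carries over directly to current tf.keras, where you don't even need to copy weights: a second functional model can reuse the trained layers and expose the tensor you want. A minimal sketch for the model from the question (untested here; it assumes model is the trained functional model shown above, so model.layers[-2].output is the output of the penultimate Dense layer):

    import tensorflow as tf

    # reuse the trained graph: map the original model input to the
    # penultimate layer's output (no weight copying needed)
    feature_extractor = tf.keras.Model(
        inputs=model.input,
        outputs=model.layers[-2].output)

    # X_batch must already match the model's input shape, e.g. (N, 224, 224, 3)
    activations = feature_extractor.predict(X_batch, batch_size=8)

    Because the new model shares the original layers, it uses the trained weights directly, and predict() runs in inference mode, so the Dropout layer is inactive without any learning-phase flag.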

    Another way to do it would be to define a Theano function to get the layer's output:

    import theano
    get_activations = theano.function([model.layers[0].input], model.layers[1].output(train=False), allow_input_downcast=True)
    activations = get_activations(X_batch) # same result as above
    

    Note: also untested.
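    As for why the error occurs in the first place: the traceback shows that one of the inputs handed to K.function is the plain integer 0, which is what K.learning_phase() evaluates to in TF 2.x eager mode. Since K.function internally builds a keras.Model from its inputs (see the keras/backend.py frame above), anything that isn't a KerasTensor triggers the ValueError. A hedged sketch of the original helper without that argument (untested; at inference time the learning-phase flag shouldn't be needed anyway, since Dropout is inactive outside training):

    from tensorflow.keras import backend as K

    def get_activations(model, layer, X_batch):
        # only KerasTensors as inputs: no K.learning_phase(), which is
        # the integer 0 ("Got: 0") that the traceback complains about
        fn = K.function([model.layers[0].input],
                        [model.layers[layer].output])
        # feed a plain numpy batch in the model's input shape,
        # not a DataGenerator instance
        return fn([X_batch])[0]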