I'm trying to adapt this TensorFlow implementation of OpenMax to CIFAR-100 for my project. As part of OpenMax, instead of softmax you have to get the activation vector of the penultimate layer, for which the following function is used:
def get_activations(model, layer, X_batch):
    get_activations = K.function(
        [model.layers[0].input, K.learning_phase()],
        [model.layers[layer].output])
    activations = get_activations(
        [DataGenerator(X_batch, mode='predict', batch_size=8,
                       augment=False, shuffle=False), 0])[0]
    # print(activations.shape)
    return activations
Function inputs: layer is the layer index (penultimate layer: -2) and X_batch is a batch of input images (shape: (597, 32, 32, 3), type: <class 'numpy.ndarray'>). The model looks like this:
import tensorflow as tf
from tensorflow.keras.layers import Input, GlobalAveragePooling2D, Dropout, Dense
from efficientnet.tfkeras import EfficientNetB0
height = 224
width = 224
channels = 3
n_classes = 20
input_shape = (height, width, channels)
# Define the input layer using the functional API
inputs = Input(shape=input_shape, name='input_layer')
# Create the EfficientNetB0 model
efnb0 = EfficientNetB0(weights='imagenet', include_top=False, input_shape=input_shape)
# Add the remaining layers
x = efnb0(inputs)
x = GlobalAveragePooling2D()(x)
x = Dropout(0.5)(x)
x = Dense(n_classes, activation='relu')(x)
outputs = Dense(n_classes, activation='softmax')(x)
# Define the model
model = tf.keras.models.Model(inputs=inputs, outputs=outputs)
model.summary()
Model: "model_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_layer (InputLayer) [(None, 224, 224, 3)] 0
efficientnet-b0 (Functional (None, 7, 7, 1280) 4049564
)
global_average_pooling2d (G (None, 1280) 0
lobalAveragePooling2D)
dropout (Dropout) (None, 1280) 0
dense (Dense) (None, 20) 25620
dense_1 (Dense) (None, 20) 420
=================================================================
Total params: 4,075,604
Trainable params: 4,033,588
Non-trainable params: 42,016
_________________________________________________________________
The DataGenerator is used to resize the input images to shape (224, 224, 3) more quickly. It probably has nothing to do with the input shape, though, since the error is already thrown at K.function(...).
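The actual DataGenerator class isn't shown here; roughly, it's a keras.utils.Sequence that resizes each batch on the fly. A minimal sketch of what it does (the names and defaults below are assumptions, not the real implementation):

import numpy as np
import tensorflow as tf

class DataGenerator(tf.keras.utils.Sequence):
    # rough sketch of the real class; only the prediction/resizing path is kept
    def __init__(self, images, mode='predict', batch_size=8, augment=False, shuffle=False):
        self.images = images
        self.batch_size = batch_size

    def __len__(self):
        return int(np.ceil(len(self.images) / self.batch_size))

    def __getitem__(self, idx):
        batch = self.images[idx * self.batch_size:(idx + 1) * self.batch_size]
        # resize each (32, 32, 3) CIFAR image to the model's (224, 224, 3) input
        return tf.image.resize(batch, (224, 224)).numpy()

Here's the complete traceback: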
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[29], line 74
72 data = X_train, X_test, y_train, y_test
73 #create_model(models[child_id], data)
---> 74 create_model(functional_model, data)
75 for i in range(5):
76 random_char = np.random.randint(0,len(X_test))
Cell In[28], line 254, in create_model(model, data)
252 print(f'i, sep_x[i]: {i}, {sep_x[i].shape}')
253 weibull_model[label[i]] = {}
--> 254 score, fc8 = compute_feature(sep_x[i], model)
255 mean = compute_mean_vector(fc8)
256 distance = compute_distances(mean, fc8, sep_y)
Cell In[28], line 131, in compute_feature(x, model)
128 def compute_feature(x, model):
129 # output = models[i].layers[-2].output # define output layer, [-1] is last layer
130 print(f'shape of sep_x: {x.shape}, {type(x)}')
--> 131 score = get_activations(model, -1, x)
132 fc8 = get_activations(model, -2, x)
133 return score, fc8
Cell In[28], line 208, in get_activations(model, layer, X_batch)
205 def get_activations(model, layer, X_batch):
206 # print (model.layers[6].output)
207 X_batch_tensor = tf.convert_to_tensor(X_batch)
--> 208 get_activations = K.function(
209 [model.layers[0].input, K.learning_phase()],
210 [model.layers[layer].output])
212 activations = get_activations([DataGenerator(X_batch, mode='predict', batch_size=8, augment=False, shuffle=False), 0])[0]
213 # print (activations.shape)
File ~\Python310\lib\site-packages\keras\backend.py:4625, in function(inputs, outputs, updates, name, **kwargs)
4622 from keras import models
4623 from keras.utils import tf_utils
-> 4625 model = models.Model(inputs=inputs, outputs=outputs)
4627 wrap_outputs = isinstance(outputs, list) and len(outputs) == 1
4629 def func(model_inputs):
File ~\Python310\lib\site-packages\tensorflow\python\trackable\base.py:205, in no_automatic_dependency_tracking.<locals>._method_wrapper(self, *args, **kwargs)
203 self._self_setattr_tracking = False # pylint: disable=protected-access
204 try:
--> 205 result = method(self, *args, **kwargs)
206 finally:
207 self._self_setattr_tracking = previous_value # pylint: disable=protected-access
File ~\Python310\lib\site-packages\keras\engine\functional.py:157, in Functional.__init__(self, inputs, outputs, name, trainable, **kwargs)
151 # Check if the inputs contain any intermediate `KerasTensor` (not
152 # created by tf.keras.Input()). In this case we need to clone the `Node`
153 # and `KerasTensor` objects to mimic rebuilding a new model from new
154 # inputs. This feature is only enabled in TF2 not in v1 graph mode.
155 if tf.compat.v1.executing_eagerly_outside_functions():
156 if not all(
--> 157 [
158 functional_utils.is_input_keras_tensor(t)
159 for t in tf.nest.flatten(inputs)
160 ]
161 ):
162 inputs, outputs = functional_utils.clone_graph_nodes(
163 inputs, outputs
164 )
165 self._init_graph_network(inputs, outputs)
File ~\Python310\lib\site-packages\keras\engine\functional.py:158, in <listcomp>(.0)
151 # Check if the inputs contain any intermediate `KerasTensor` (not
152 # created by tf.keras.Input()). In this case we need to clone the `Node`
153 # and `KerasTensor` objects to mimic rebuilding a new model from new
154 # inputs. This feature is only enabled in TF2 not in v1 graph mode.
155 if tf.compat.v1.executing_eagerly_outside_functions():
156 if not all(
157 [
--> 158 functional_utils.is_input_keras_tensor(t)
159 for t in tf.nest.flatten(inputs)
160 ]
161 ):
162 inputs, outputs = functional_utils.clone_graph_nodes(
163 inputs, outputs
164 )
165 self._init_graph_network(inputs, outputs)
File ~\Python310\lib\site-packages\keras\engine\functional_utils.py:48, in is_input_keras_tensor(tensor)
32 """Check if tensor is directly generated from `tf.keras.Input`.
33
34 This check is useful when constructing the functional model, since we will
(...)
45 ValueError: if the tensor is not a KerasTensor instance.
46 """
47 if not node_module.is_keras_tensor(tensor):
---> 48 raise ValueError(_KERAS_TENSOR_TYPE_CHECK_ERROR_MSG.format(tensor))
49 return tensor.node.is_input
ValueError: Found unexpected instance while processing input tensors for keras functional model. Expecting KerasTensor which is from tf.keras.Input() or output from keras layer call(). Got: 0
I thought the Keras backend function didn't like something about the input layer of EfficientNetB0, so I retrained the model using the Keras functional API (as shown above), but nothing changed (I had originally used the sequential API). I also tried updating Keras and TensorFlow to the latest versions, but still no luck. If you need anything else from the code, let me know; I didn't post everything since it's a relatively large chunk of code.
This isn't an actual answer to the problem and doesn't explain why the error occurs, but I've found a workaround. According to fchollet's comment, the easiest way to get intermediate results of a hidden layer in a network is to take that layer's weights after training, build a new model with them, and predict your batch of images with the new model. I also tried to write a function similar to the one above with Theano, but that too has all sorts of import issues. Here's the full answer:
One simple way to do it is to use the weights of your model to build a new model that's truncated at the layer you want to read. Then you can run the .predict(X_batch) method to get the activations for a batch of inputs.
Example:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation

# this is your initial model
model = Sequential()
model.add(Dense(64, input_shape=(20,), kernel_initializer='uniform'))
model.add(Activation('tanh'))
model.add(Dense(1, kernel_initializer='uniform'))
model.add(Activation('softmax'))

# we train it
model.compile(loss='mse', optimizer='sgd')
model.fit(X_train, y_train, epochs=20, batch_size=16)

# we build a new model with the weights of the old model
# this model is truncated after the first layer
model2 = Sequential()
model2.add(Dense(64, input_shape=(20,), weights=model.layers[0].get_weights()))
model2.add(Activation('tanh'))

activations = model2.predict(X_batch)
Note: I haven't tested it.
Another way to do it would be to define a Theano function to get the layer's output:
import theano
get_activations = theano.function(
    [model.layers[0].input],
    model.layers[1].output(train=False),
    allow_input_downcast=True)
activations = get_activations(X_batch)  # same result as above
Note: also untested.
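For completeness: with the TF2 functional API you don't even need to copy weights into a second model; you can point a new Model at the existing graph. A minimal sketch, assuming the functional model from the question is in scope (X_batch_resized is a placeholder name for the batch after resizing to (224, 224, 3)):

import tensorflow as tf

# sub-models that reuse the trained layers, cut off at the last and penultimate layers
scores_model = tf.keras.Model(inputs=model.input, outputs=model.layers[-1].output)
penultimate_model = tf.keras.Model(inputs=model.input, outputs=model.layers[-2].output)

score = scores_model.predict(X_batch_resized, batch_size=8)      # softmax scores
fc8 = penultimate_model.predict(X_batch_resized, batch_size=8)   # activation vectors for OpenMax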