Tags: python, keras, neural-network, autoencoder

Custom Activation with custom gradient does not work


I am trying to write code for training a simple neural network. The goal is to define a custom activation function and, instead of letting Keras take its derivative automatically for backpropagation, have Keras use my custom gradient function for my custom activation:

import numpy as np
import tensorflow as tf
from keras.models import Model
from keras.layers import Input, Dense, Lambda
from keras import backend as K

@tf.custom_gradient
def custom_activation(x):

    def grad(dy):
        # Zero gradient everywhere: nothing should flow backwards
        # through this activation.
        return dy * 0

    # Sigmoid rescaled from (0, 1) to (-1, 1)
    result = K.sigmoid(x) * 2 - 1
    return result, grad

x_train = np.array([[1, 2], [3, 4], [3, 4]])

inputs = Input(shape=(2,))
output_1 = Dense(20, kernel_initializer='glorot_normal')(inputs)
layer = Lambda(lambda x: custom_activation)(output_1)
output_2 = Dense(2, activation='linear', kernel_initializer='glorot_normal')(layer)
model2 = Model(inputs=inputs, outputs=output_2)

model2.compile(optimizer='adam', loss='mean_squared_error')
model2.fit(x_train, x_train, epochs=20, validation_split=0.1, shuffle=False)

Since the gradient has been defined to be zero, I expect the loss to stay unchanged across all epochs. Instead, here is the traceback of the error I get:

Using TensorFlow backend.
WARNING:tensorflow:From C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Traceback (most recent call last):
  File "C:/p/CE/mytest.py", line 43, in <module>
    layer = Lambda(lambda x: custom_activation)(output_1)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\base_layer.py", line 474, in __call__
    output_shape = self.compute_output_shape(input_shape)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\core.py", line 656, in compute_output_shape
    return K.int_shape(x)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 593, in int_shape
    return tuple(x.get_shape().as_list())
AttributeError: 'function' object has no attribute 'get_shape'

Update: I used Manoj Mohan's answer and now the code works. I expect to see the loss unchanged across epochs since the gradient is defined to be zero, but it does change. Why? Am I missing something?

Example:

Epoch 1/20
2019-10-03 10:31:34.193232: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2

2/2 [==============================] - 0s 68ms/step - loss: 8.3184 - val_loss: 13.7232
Epoch 2/20

2/2 [==============================] - 0s 496us/step - loss: 8.2783 - val_loss: 13.6368

Solution

  • Replace

    layer = Lambda(lambda x: custom_activation)(output_1)
    

    with

    layer = Lambda(custom_activation)(output_1)
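
    With lambda x: custom_activation, the Lambda layer calls the lambda, which ignores x and returns the function object itself instead of a tensor; Keras then calls .get_shape() on that function, which is exactly the AttributeError in the traceback. A minimal illustration of the mistake:

    f = lambda x: custom_activation   # ignores x and returns the function itself
    print(f(np.zeros((1, 2))))        # prints a function object, not a tensor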
    

    I expect to see the loss unchanged across epochs since the gradient is defined to be zero. But it does change. Why?

    The gradient was set to zero in an intermediate layer, so gradients do not flow backwards from that point. But from the output back to that intermediate layer, gradients still flow and those weights get updated, which is why the loss keeps changing. The modified architecture below, which makes the custom activation the last layer, outputs a constant loss across epochs (see the gradient check after the code):

    inputs = Input(shape=(2,))
    output_1 = Dense(20, kernel_initializer='glorot_normal')(inputs)
    output_2 = Dense(2, activation='linear', kernel_initializer='glorot_normal')(output_1)
    layer = Lambda(custom_activation)(output_2)  # should be the last layer
    model2 = Model(inputs=inputs, outputs=layer)
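
    One way to confirm the gradient flow is to inspect the symbolic gradients directly. This is a minimal sketch assuming the TF 1.x graph-mode Keras backend used in the question; get_grads is a name introduced here for illustration:

    grads = K.gradients(K.sum(model2.output), model2.trainable_weights)
    get_grads = K.function([model2.input], grads)
    for w, g in zip(model2.trainable_weights, get_grads([x_train])):
        print(w.name, np.abs(g).max())

    For this modified model every printed value is 0.0, since the zero custom gradient now blocks the entire backward pass, so fitting leaves the loss constant. Running the same check on the original ordering shows zero gradients only for the first Dense layer's weights, which is why the loss still changed there.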