So I want to try out an adaptive activation function for my neural network. That means I want a custom activation that is similar to a standard one (like tanh or relu), but with some additional trainable parameters.
Currently, I am trying to add this trainable parameter by creating the activation function as a custom layer:
import tensorflow as tf
from tensorflow import keras

class AdaptiveActivation(keras.layers.Layer):
    """
    Adaptive activation function that is changed in the training process.
    """
    def __init__(self, act="tanh"):
        super(AdaptiveActivation, self).__init__()
        # trainable parameter that scales the activation's input
        self.a = tf.Variable(0.1, dtype=tf.float32, trainable=True)
        # fixed scaling factor
        self.n = tf.constant(10.0, dtype=tf.float32)
        self.act = act

    def call(self, x):
        if self.act == "tanh":
            return keras.activations.tanh(self.a * self.n * x)
        elif self.act == "relu":
            return keras.activations.relu(self.a * self.n * x)
However, if I understood some test outputs correctly, this means that every layer which uses this activation gets its own unique parameter a, so for every hidden layer I end up with a different a. What I want is one single a shared by all my activation functions: instead of, say, 9 different values for a per epoch, always just one a that can change between epochs.
Furthermore, is there an easy way to obtain the a from this layer so I can output it during training?
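For reference, a minimal check along those lines (the two-layer toy model below is just for illustration, the shapes are arbitrary) prints one a per AdaptiveActivation instance:

import tensorflow as tf
from tensorflow import keras

# Toy model with two separate AdaptiveActivation instances.
inputs = keras.Input(shape=(4,))
x = keras.layers.Dense(8)(inputs)
x = AdaptiveActivation(act="tanh")(x)
x = keras.layers.Dense(8)(x)
x = AdaptiveActivation(act="tanh")(x)
outputs = keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

# Each AdaptiveActivation layer reports its own a, so there are two
# independent trainable parameters rather than one shared value.
for layer in model.layers:
    if isinstance(layer, AdaptiveActivation):
        print(layer.name, float(layer.a.numpy()))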
OK, the solution was stupidly easy: I can just pass a trainable TensorFlow variable to the layer from outside and assign it to self.a there.
class AdaptiveActivation(keras.layers.Layer):
    """
    Adaptive activation function that is changed in the training process.
    """
    def __init__(self, a, act="tanh"):
        super(AdaptiveActivation, self).__init__()
        # shared trainable parameter, created once outside the layer
        self.a = a
        # fixed scaling factor
        self.n = tf.constant(5.0, dtype=tf.float32)
        self.act = act

    def call(self, x):
        if self.act == "tanh":
            return keras.activations.tanh(self.a * self.n * x)
        elif self.act == "relu":
            return keras.activations.relu(self.a * self.n * x)
This also solves the "issue" of tracking it.
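A rough sketch of how this can be wired up, with an arbitrary toy model, random training data, and a made-up PrintA callback name purely for illustration, would be something like this:

import numpy as np
import tensorflow as tf
from tensorflow import keras

# One shared slope parameter for all activation layers.
shared_a = tf.Variable(0.1, dtype=tf.float32, trainable=True)

inputs = keras.Input(shape=(4,))
x = keras.layers.Dense(16)(inputs)
x = AdaptiveActivation(shared_a, act="tanh")(x)
x = keras.layers.Dense(16)(x)
x = AdaptiveActivation(shared_a, act="tanh")(x)
outputs = keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

class PrintA(keras.callbacks.Callback):
    """Print the shared a after every epoch."""
    def on_epoch_end(self, epoch, logs=None):
        print(f"epoch {epoch}: a = {shared_a.numpy():.4f}")

model.compile(optimizer="adam", loss="mse")
x_train = np.random.rand(64, 4).astype("float32")
y_train = np.random.rand(64, 1).astype("float32")
model.fit(x_train, y_train, epochs=3, callbacks=[PrintA()], verbose=0)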
It does feel very unnecessary though; why couldn't I just have done this without having to implement a new layer first?