keras, deep-learning, neural-network, activation-function

Why does almost every Activation Function Saturate at Negative Input Values in a Neural Network


This may be a very basic/trivial question.

For negative inputs:

  1. The output of the ReLU activation function is zero
  2. The output of the Sigmoid activation function is zero
  3. The output of the Tanh activation function is -1

My questions are:

  1. Why do all of the above activation functions saturate for negative input values?
  2. Is there an activation function we can use if we want to predict a negative target value?

Thank you.


Solution

    1. True - ReLU is designed to output zero for negative values. (This can be dangerous with large learning rates, bad initialization, or very few units: all neurons can get stuck at zero and the model freezes.)

    2. False - Sigmoid only approaches zero for very negative inputs, not for all negative inputs. If your inputs are between -3 and +3, you will see a nicely varying result between 0 and 1.

    3. False - Same comment as for Sigmoid: Tanh only approaches -1 for very negative inputs. If your inputs are between -2 and +2, you will see nicely varying results between -1 and 1.


    So, the saturation problem only exists for inputs whose absolute values are too big.
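
    For a concrete feel, here is a small standalone check (plain NumPy; the sample inputs are arbitrary) that prints all three activations over a range of values; sigmoid and tanh only flatten out once the absolute value of the input gets large:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # sigmoid(-1) is about 0.27 and tanh(-1) is about -0.76: far from saturated.
    # Only for very negative x (e.g. -10) do the outputs get pinned near 0 and -1.
    for x in [-10.0, -3.0, -1.0, 0.0, 1.0, 3.0, 10.0]:
        print(f"x={x:6.1f}  relu={max(0.0, x):7.3f}  "
              f"sigmoid={sigmoid(x):.4f}  tanh={np.tanh(x):.4f}")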

    By definition, the outputs are:

        relu(x)    = max(0, x)                               -> range [0, +inf)
        sigmoid(x) = 1 / (1 + exp(-x))                        -> range (0, 1)
        tanh(x)    = (exp(x) - exp(-x)) / (exp(x) + exp(-x))  -> range (-1, 1)

    You might want to use a BatchNormalization layer before these activations to keep their inputs from growing large and avoid saturation.
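
    A minimal sketch of that idea (the layer sizes and input shape are made up for illustration): the Dense layer is left linear, BatchNormalization rescales its output, and only then is the activation applied.

    from keras.models import Sequential
    from keras.layers import Dense, BatchNormalization, Activation

    model = Sequential()
    model.add(Dense(64, input_shape=(20,)))  # linear pre-activation output
    model.add(BatchNormalization())          # keeps pre-activation values in a moderate range
    model.add(Activation('tanh'))            # activation applied after normalization
    model.add(Dense(1))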


    For predicting negative outputs, tanh is the only one of the three that is capable of doing that.

    You could invent a negative sigmoid, though; it's pretty easy:

    import keras
    from keras.layers import Activation

    def neg_sigmoid(x):
        # mirrored sigmoid: output lies in (-1, 0)
        return -keras.backend.sigmoid(x)

    # use it in a layer:
    Activation(neg_sigmoid)
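
    For instance (a hypothetical model, just to show where the custom activation goes), you could put it on the output of a network whose targets lie between -1 and 0:

    from keras.models import Sequential
    from keras.layers import Dense, Activation

    model = Sequential()
    model.add(Dense(32, activation='relu', input_shape=(10,)))
    model.add(Dense(1))                  # linear output
    model.add(Activation(neg_sigmoid))   # squashed into (-1, 0)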