In "Bounding hyperparameter optimization with Tensorflow bijector chain in GPflow 2.0", I found an excellent explanation of how to set bounds on my hyperparameters.
Unfortunately, I noticed that using the tensorflow_probability.bijectors.Sigmoid
transform causes numerical instabilities, which for me lead to parameter values outside [low, high].
My current workaround is to define my own sigmoid transform, based on the alternative implementation given in the comments of the tensorflow_probability
source code:
import tensorflow as tf
from tensorflow_probability import bijectors as tfb
from tensorflow_probability import math as tfm


class mySigmoid(tfb.Sigmoid):
    @staticmethod
    def _stable_sigmoid(x):
        """A (more) numerically stable sigmoid than `tf.math.sigmoid`."""
        x = tf.convert_to_tensor(x)
        # Below the cutoff, sigmoid(x) is approximated by exp(x).
        if x.dtype == tf.float64:
            cutoff = -20
        else:
            cutoff = -9
        return tf.where(x < cutoff, tf.exp(x), tf.math.sigmoid(x))

    def _forward(self, x):
        if self._is_standard_sigmoid:
            return self._stable_sigmoid(x)
        lo = tf.convert_to_tensor(self.low)  # Concretize only once
        hi = tf.convert_to_tensor(self.high)
        ans = hi * tf.sigmoid(x) + lo * tf.sigmoid(-x)
        # Clip into [lo, hi] while still passing gradients through.
        return tfm.clip_by_value_preserve_gradient(ans, lo, hi)
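To make it easier to see what the bounded forward pass computes, here is a minimal pure-Python sketch of the same idea (blend `lo` and `hi` with sigmoid weights, then clip). The names `stable_sigmoid` and `bounded_sigmoid` are my own and hypothetical, not part of TensorFlow Probability, and plain `min`/`max` stands in for the gradient-preserving clip:

```python
import math


def stable_sigmoid(x: float) -> float:
    """Logistic function written so that math.exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # x < 0 here, so exp(x) is at most 1
    return e / (1.0 + e)


def bounded_sigmoid(x: float, lo: float, hi: float) -> float:
    """Map x into [lo, hi] with sigmoid weights, then clip to the bounds."""
    ans = hi * stable_sigmoid(x) + lo * stable_sigmoid(-x)
    # Assumption: a plain clip replaces clip_by_value_preserve_gradient,
    # since there are no gradients in this sketch.
    return min(max(ans, lo), hi)


# Even for extreme inputs the result stays inside the requested interval.
for x in (-1000.0, -40.0, 0.0, 40.0, 1000.0):
    y = bounded_sigmoid(x, lo=0.1, hi=0.3)
    print(x, y, 0.1 <= y <= 0.3)
```

Because of the final clip, very large or very small inputs land exactly on a bound rather than strictly inside the interval.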
The TensorFlow Probability source notes that this approach has some drawbacks, however, so I wanted to ask: are there alternative ways to bound hyperparameters in GPflow?
Another way to constrain a parameter in GPflow is to place a prior on it. For example:
import gpflow
import tensorflow_probability as tfp
k = gpflow.kernels.Matern32()
k.variance.prior = tfp.distributions.Gamma(gpflow.utilities.to_default_float(2), gpflow.utilities.to_default_float(3))
See more here.
Whether a prior is an appropriate solution depends on the details of what you're trying to accomplish.
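One thing to keep in mind is that a prior acts as a soft constraint rather than a hard bound: a Gamma prior keeps the parameter positive and, as far as I understand it, adds its log density to the training objective, so unlikely values are penalized but not forbidden. As a rough illustration, here is a small pure-Python sketch of the log density of a Gamma distribution with concentration 2 and rate 3 (the same values as in the example above). The helper `gamma_log_prob` is a hypothetical name of my own, not a GPflow or TensorFlow Probability function:

```python
import math


def gamma_log_prob(x: float, concentration: float, rate: float) -> float:
    """Log density of a Gamma(concentration, rate) distribution at x > 0."""
    return (
        concentration * math.log(rate)
        - math.lgamma(concentration)
        + (concentration - 1.0) * math.log(x)
        - rate * x
    )


# The density peaks at (concentration - 1) / rate = 1/3 and falls off on both
# sides, so values far from the mode are discouraged but still allowed.
for value in (0.01, 1.0 / 3.0, 1.0, 10.0):
    print(value, gamma_log_prob(value, concentration=2.0, rate=3.0))
```

If you need the parameter to stay strictly inside an interval, a bounded transform is still the more direct tool; a prior mainly shapes where the optimizer prefers to end up.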