[SOLVED] How to tune hyperparameters of TF-Agents agents and policies in TensorFlow?

I have set up a Python environment and wrapped it in a TensorFlow class to make it a TensorFlow environment. Then I set up the learning as per the Colab notebooks listed here. Currently, I am using the DQN and REINFORCE agents.

The setup works well and the results are roughly as expected. Now I want to tune the hyperparameters, such as a decaying epsilon-greedy schedule, network weights, etc.

I need some pointers on where in the documentation to look and how to access these hyperparameters.

Solution

REINFORCE does not support an epsilon-greedy policy, so I suggest switching to a DQN or DDQN agent.

To pass a specified Q-Network you can use something like:

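from tf_agents.networks import q_network

# fc_layer_params is a tuple of hidden-layer sizes, e.g. (100,).
q_net = q_network.QNetwork(
    environment.time_step_spec().observation['observations'],
    environment.action_spec(),
    fc_layer_params=fc_layer_params)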

and pass it to your agent at initialization through its q_network argument. For a decaying epsilon-greedy policy, you can define your own function decaying_epsilon(train_step, **kwargs) however you prefer. Then create your train_step variable and bind it to the function with functools.partial like this:

from functools import partial
import tensorflow as tf
train_step = tf.Variable(0, trainable=False, name='global_step', dtype=tf.int64)
partial_decaying_eps = partial(decaying_epsilon, train_step, **kwargs)

You can now pass partial_decaying_eps to your agent as its epsilon_greedy argument, and it will work as expected, with epsilon updating as train_step advances. Be sure to pass this same train_step variable to the agent as its train_step_counter, though.
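To make the binding concrete, here is a minimal sketch of one possible decaying_epsilon: a linear decay from a start value to an end value. The parameter names start, end and decay_steps are my own choices, and StepCounter is a hypothetical stand-in for the tf.Variable so the snippet runs without TensorFlow installed. It only illustrates how partial keeps a reference to the counter. With a real tf.Variable, you would probably want to write the arithmetic with TensorFlow ops so it also works in graph mode.

```python
from functools import partial


class StepCounter:
    """Hypothetical stand-in for the tf.Variable train step (not a TF-Agents class)."""

    def __init__(self):
        self.value = 0

    def numpy(self):
        return self.value

    def assign_add(self, n):
        self.value += n


def decaying_epsilon(train_step, start=1.0, end=0.05, decay_steps=1000):
    """Linearly decay epsilon from `start` to `end` over `decay_steps` steps."""
    # Clamp the step so epsilon stays at `end` once the decay is finished.
    step = min(int(train_step.numpy()), decay_steps)
    fraction = step / decay_steps
    return start + fraction * (end - start)


train_step = StepCounter()
partial_decaying_eps = partial(
    decaying_epsilon, train_step, start=1.0, end=0.1, decay_steps=100)

eps_at_start = partial_decaying_eps()   # counter is at 0
train_step.assign_add(50)
eps_halfway = partial_decaying_eps()    # counter is at 50
train_step.assign_add(100)
eps_after_end = partial_decaying_eps()  # counter is at 150, clamped to 100
```

Because partial stores a reference to the counter rather than a copy of its value, every call to partial_decaying_eps() sees the current step, which is why the agent and the schedule have to share the same train_step object.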

Other hyperparameters can be changed just as easily: look at the arguments of the agent's __init__ function in the DQN documentation.