python tensorflow reinforcement-learning dqn

How to tune hyperparameters of tf-agents and policies in TensorFlow?


I have set up a Python environment and wrapped it in a TensorFlow class to make it a TensorFlow environment. I then set up the training as in the Colab notebooks listed here. Currently, I am using the DQN and REINFORCE agents.

The setup works well and the results are roughly as expected. Now I want to tune the hyperparameters, such as a decaying epsilon for the epsilon-greedy policy, weights, etc.

I need some pointers on where to find these hyperparameters in the documentation and how to set them.


Solution

  • REINFORCE does not support an epsilon-greedy policy; I suggest switching to a DQN or DDQN agent.

    To pass a custom Q-network, you can use something like:

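    from tf_agents.networks import q_network

    # fc_layer_params is a tuple of hidden-layer sizes, e.g. (100, 50).
    # The variable is named q_net so it does not shadow the q_network module.
    q_net = q_network.QNetwork(
        environment.time_step_spec().observation['observations'],
        environment.action_spec(),
        fc_layer_params=fc_layer_params)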
    

    and pass it to your agent on initialization through its q_network argument. For a decaying epsilon-greedy policy, you can define your own function decaying_epsilon(train_step, **kwargs) as you prefer. Then initialize your train_step tensor and bind it with functools.partial like this:

    from functools import partial

    train_step = tf.Variable(0, trainable=False, name='global_step', dtype=tf.int64)
    partial_decaying_eps = partial(decaying_epsilon, train_step, **kwargs)
    

    You can now pass partial_decaying_eps to your agent as its epsilon_greedy argument, and it will work as expected: epsilon is re-evaluated as your train_step tensor advances. Be sure to pass this same train_step tensor to your agent as train_step_counter, though.
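
As a rough illustration of what decaying_epsilon could look like, here is a minimal sketch of a linear decay bound with functools.partial. The StepCounter class, the default values, and the linear schedule itself are assumptions made for this example and are not part of TF-Agents. The counter is a plain-Python stand-in for the tf.Variable so that the snippet runs on its own. With a real tf.Variable, the arithmetic would probably need TensorFlow ops such as tf.cast and tf.minimum, because the agent may evaluate the callable in graph mode.

```python
from functools import partial


class StepCounter:
    """Hypothetical stand-in for the tf.Variable train-step counter."""

    def __init__(self):
        self.count = 0

    def assign_add(self, n):
        self.count += n


def decaying_epsilon(train_step, initial_eps=1.0, final_eps=0.05, decay_steps=1000):
    """Linearly anneal epsilon from initial_eps to final_eps over decay_steps."""
    # Clamp the step so epsilon stays at final_eps once the decay is finished.
    step = min(train_step.count, decay_steps)
    fraction = step / decay_steps
    return initial_eps + fraction * (final_eps - initial_eps)


train_step = StepCounter()

# Bind the counter and the schedule settings; the result takes no arguments,
# which is the shape of callable an epsilon-greedy setting can re-evaluate.
partial_decaying_eps = partial(
    decaying_epsilon, train_step, final_eps=0.1, decay_steps=1000)

eps_start = partial_decaying_eps()   # counter at 0
train_step.assign_add(500)
eps_mid = partial_decaying_eps()     # halfway through the decay
train_step.assign_add(5000)
eps_end = partial_decaying_eps()     # past decay_steps, clamped at final_eps
```

Because the partial holds a reference to the counter rather than a copy of its value, every call sees the current step. This is why the same counter object has to be shared with the agent.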

    Other hyperparameters can be changed just as easily: look at the arguments of the DQN agent's __init__ function in the documentation.
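
One way to keep those __init__ arguments manageable during tuning is to collect them in a dictionary and unpack it when building the agent. The sketch below assumes the keyword names of DqnAgent.__init__ (epsilon_greedy, gamma, n_step_update, target_update_tau, target_update_period, reward_scale_factor, gradient_clipping). The values are purely illustrative and with_overrides is a hypothetical helper, not a TF-Agents function. The learning rate is not in the list because it belongs to the optimizer you hand to the agent.

```python
# Keyword arguments of DqnAgent.__init__; the values are illustrative only.
dqn_hyperparameters = {
    "epsilon_greedy": 0.1,        # a float, or a zero-argument callable
    "gamma": 0.99,                # discount factor
    "n_step_update": 1,           # number of steps used for the TD target
    "target_update_tau": 1.0,     # 1.0 means a hard target-network update
    "target_update_period": 100,  # train steps between target updates
    "reward_scale_factor": 1.0,
    "gradient_clipping": None,
}


def with_overrides(base, **overrides):
    """Hypothetical helper: copy base and apply overrides, rejecting typos."""
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"unknown hyperparameters: {sorted(unknown)}")
    return {**base, **overrides}


# One tuning trial that changes two settings and keeps the rest.
trial = with_overrides(dqn_hyperparameters, gamma=0.95, target_update_period=500)

# With TF-Agents installed, the trial would be unpacked roughly like this
# (time_step_spec, action_spec, q_net, optimizer and train_step come from
# your own setup):
# agent = dqn_agent.DqnAgent(
#     time_step_spec, action_spec, q_network=q_net, optimizer=optimizer,
#     train_step_counter=train_step, **trial)
```

Keeping the base dictionary unchanged and deriving each trial from it makes it easy to log exactly which values a run used.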