reinforcement-learning, torch, openai-gym

How can I access additional information about my gymnasium environment during a torchrl rollout?


I am currently training a PPO agent in my custom gymnasium environment, which implements a pursuit-evasion game. During training, however, I want to periodically evaluate the progress of my policy and visualize the results in the form of a trajectory. (My code basically follows this official tutorial, in case you need more context.)

        # every 10 training iterations, run a deterministic evaluation rollout
        if i % 10 == 0:
            with set_exploration_type(ExplorationType.MEAN), torch.no_grad():
                eval_rollout = env.rollout(1000, policy_module)

The 'rollout' method returns a tensordict object that contains some information about the given trajectory, such as the actions taken by the agent (the evader) and the observations. The observations I chose, however, are relative metrics, whereas I would need the absolute positions of the evader and the pursuer in order to plot their behaviour. I'd love to include the absolute positions somewhere in this tensordict, similar to how it's possible to return an info dict from the step method inside gymnasium environments:

    def _get_info(self):
        return {'evader_pos': self.evader.get_position(),
                'pursuer_pos': self.pursuer.get_position()}

    def step(self, action):
        ...

        return observation, reward, terminated, False, self._get_info()

I went over both the torchrl documentation and the source code but didn't find anything related to my intentions. So my question is: Is there an alternative I can try, or did I overlook something that provides additional information to the rollout tensordict?

I tried the obvious: reconstructing the absolute positions from the relative metrics. This was not possible, however, as I would have needed to introduce new (redundant) information into the observations, which I deemed not to be beneficial.

Thanks a lot for your time!


Solution

  • Assuming you are using torchrl's 'GymEnv' class when instantiating your custom environment, you can use 'set_info_dict_reader' together with 'default_info_dict_reader', passing it the keys that your environment returns in its info dict.

    from torchrl.envs import default_info_dict_reader
    from torchrl.envs.libs.gym import GymEnv

    base_env = GymEnv("env_string-v0", device=device, frame_skip=frame_skip)

    # register the info keys your env's step() returns, e.g. "evader_pos"
    reader = default_info_dict_reader(["my_info_key"])
    base_env = base_env.set_info_dict_reader(info_dict_reader=reader)


    This way your absolute positions will be included in the tensordict returned by the rollout method. Hope that helps!
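
    For the plotting part, here is a minimal sketch of how the registered keys could be read back from the evaluation rollout. It assumes the info dict returns 'evader_pos' and 'pursuer_pos' as in your question, that each position is an (x, y) pair, and that 'env' and 'policy_module' are the objects from your training script; the exact key location can vary between torchrl versions, but step-wise info entries typically end up under the 'next' sub-tensordict.

        import torch
        import matplotlib.pyplot as plt
        from torchrl.envs.utils import ExplorationType, set_exploration_type

        with set_exploration_type(ExplorationType.MEAN), torch.no_grad():
            eval_rollout = env.rollout(1000, policy_module)

        # info entries written during step() land in the "next" sub-tensordict;
        # each entry has shape [rollout_length, 2] for (x, y) positions
        evader_traj = eval_rollout["next", "evader_pos"].cpu().numpy()
        pursuer_traj = eval_rollout["next", "pursuer_pos"].cpu().numpy()

        plt.plot(evader_traj[:, 0], evader_traj[:, 1], label="evader")
        plt.plot(pursuer_traj[:, 0], pursuer_traj[:, 1], label="pursuer")
        plt.legend()
        plt.show()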