python reinforcement-learning openai-gym stable-baselines ddpg

How to correctly define the observation space for the custom Gym environment I am creating using gym.spaces.Box?


I am trying to implement the DDPG algorithm from the paper.

In the image below, g_k[n] and r_k[n] are K x M matrices of real values, and theta[n] and v[n] are arrays of size M.

I want to write correct code to specify the state/observation space in my custom environment.

The paper states that, since the data type input to the neural network needs to be unified, the state array can be expressed as a single array (this is the state space definition given in the paper).

My attempt so far is:

    observation_space = spaces.Box(low=0, high=1, shape=(K, M), dtype=np.float16......)

I am stuck.
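
To make the shapes concrete: if I flatten and concatenate g_k[n], r_k[n], theta[n] and v[n], the unified state vector would have 2*K*M + 2*M entries, so I think the single-Box version would look something like this (the values of K and M and the [0, 1] normalization below are placeholders of mine, not values from the paper):

    import numpy as np
    from gym import spaces

    K, M = 4, 8  # placeholder sizes

    # g and r contribute K*M entries each; theta and v contribute M each
    state_dim = 2 * K * M + 2 * M
    observation_space = spaces.Box(low=0.0, high=1.0, shape=(state_dim,), dtype=np.float32)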


Solution

  • If you use stable-baselines3, you can use a Dict observation space filled with Box spaces that have meaningful limits for each of your vectors and matrices (if the limits are unknown, you can always use -inf/+inf). The code could be something like the snippet below; a fuller sketch showing reset()/step() and training follows after it:

    import numpy as np
    from gym import Env
    from gym.spaces import Box, Dict

    class MySuperGymEnv(Env):
      def __init__(self):
        ...
        # one Box per component of the observation; the shapes here are placeholders
        spaces = {
           'theta': Box(low=0, high=1, shape=(99,), dtype=np.float32),
           'g': Box(low=0, high=255, shape=(100, 200), dtype=np.float32),
           ...
        }
        self.observation_space = Dict(spaces)
        ...
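
    Your reset() and step() then have to return observations that match the declared Dict space key for key, shape for shape and dtype for dtype. A rough sketch of what that could look like (the concrete shapes, the action space and the constant reward below are placeholders, not something taken from the paper):

    import numpy as np
    from gym import Env
    from gym.spaces import Box, Dict

    class MySuperGymEnv(Env):
      def __init__(self):
        super().__init__()
        self.observation_space = Dict({
          'theta': Box(low=0, high=1, shape=(99,), dtype=np.float32),
          'g': Box(low=0, high=255, shape=(100, 200), dtype=np.float32),
        })
        self.action_space = Box(low=-1, high=1, shape=(4,), dtype=np.float32)  # placeholder

      def _get_obs(self):
        # every key, shape and dtype must match self.observation_space
        return {
          'theta': np.zeros(99, dtype=np.float32),
          'g': np.zeros((100, 200), dtype=np.float32),
        }

      def reset(self):
        return self._get_obs()

      def step(self, action):
        reward, done, info = 0.0, False, {}
        return self._get_obs(), reward, done, info

    Because the observation space is a Dict, stable-baselines3 expects the "MultiInputPolicy" rather than "MlpPolicy", so training DDPG on this environment could look roughly like:

    from stable_baselines3 import DDPG

    model = DDPG("MultiInputPolicy", MySuperGymEnv(), verbose=1)
    model.learn(total_timesteps=10_000)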