Tags: python, pytorch, openai-gym, stable-baselines

stable_baselines3 PPO model crashes during training due to error in dummy_vec_env.py


I am attempting to train a PPO model on the CartPole-v1 environment.

import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.evaluation import evaluate_policy

env_id = "CartPole-v1"
# Make the vectorized environment
envs = make_vec_env(env_id, n_envs=4)
envs = VecNormalize(envs)

# Train the model
model = PPO(policy="MlpPolicy", env=envs, verbose=1)
model.learn(1000)
model.save("CartPole-v1-model")
envs.save("CartPole-v1-env")

I get this error message:

[screenshot: error traceback pointing into dummy_vec_env.py]

I have a CPU-only PyTorch install, which I initially suspected as the cause. However, the source code for dummy_vec_env.py and its parent base_vec_env.py does not import PyTorch at all, so I am not sure PyTorch is really to blame.
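
As a sanity check on the PyTorch side, the installed build can be printed directly; a CPU-only wheel is not itself a problem for a task as small as CartPole (a minimal sketch):

import torch

# A CPU-only wheel typically reports a version suffixed with "+cpu"
# and no usable CUDA device; neither prevents PPO from training.
print(torch.__version__)
print(torch.cuda.is_available())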

The exact same code runs successfully in the HuggingFace Google Colab notebook https://colab.research.google.com/github/huggingface/deep-rl-class/blob/master/notebooks/unit1/unit1.ipynb, so I am confused about why it fails on my local machine.

Stepping through dummy_vec_env.py in the debugger, I found that the variable obs holds a tuple rather than an array. [screenshot: debugger]
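
That tuple is consistent with the reset API change in newer Gym releases, and it can be reproduced without stable_baselines3 at all (a minimal sketch, assuming gym >= 0.26 is installed locally):

import gym

env = gym.make("CartPole-v1")
result = env.reset()

# gym <= 0.25: reset() returns just the observation array.
# gym >= 0.26: reset() returns an (observation, info) tuple, which an
# old DummyVecEnv does not unpack before storing it in obs.
print(type(result))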

Any help would be much appreciated!


Solution

  • The problem is caused by the outdated stable_baselines3 version that conda installs; mine was 1.1.0.

    Installing a later version of stable_baselines3 using pip solves the problem. I used

    pip install stable-baselines3==2.0.0a5
    

    Note: I installed 2.0.0a5 to match the HuggingFace Google Colab notebook, but later stable_baselines3 releases will most likely work as well; a quick version check is shown below.
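
    To confirm that the pip package now shadows the old conda one, print the version that actually gets imported (a minimal sketch; the expected output depends on the release you installed):

    import stable_baselines3

    # Should print the pip release (2.0.0a5 here); if it still shows 1.1.0,
    # the environment is still resolving the old conda package.
    print(stable_baselines3.__version__)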