I am evaluating a random agent on 'env = gym.make('MountainCar-v0', render_mode='rgb_array')' and getting a 100% success rate, which seems weird. Can you please guide me on this?
That is, the car's position always ends up greater than 0.5, which is the goal.
    env = gym.make('MountainCar-v0', render_mode='rgb_array')
    state, _ = env.reset()
    max_position = -99
    done = False
    while not done:
        action = env.action_space.sample()
        next_state, reward, done, trun, info = env.step(action)
        if next_state[0] > max_position:
            max_position = next_state[0]
    print(next_state[0], max_position)
0.50212365 0.50212365
When I compare with other results, a random agent gets a 0% success rate.
I'm not sure I understand correctly, but I checked the Mountain Car environment. The goal is to reach a position of at least 0.5.
The while loop only ends when done == True. The boolean done is True only when the step function returns a next_state whose position (next_state[0]) has reached 0.5.
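That is why max_position is always greater than 0.5, even when actions are chosen randomly. The agent acts randomly until it reaches the goal by chance, and the loop doesn't end until a position of at least 0.5 is found. The episode's 200-step time limit is reported through the separate truncation flag (trun in your code), which the loop never checks, so the episode is never cut short. The loop would finish sooner if actions were chosen based on acquired knowledge (reinforcement learning) rather than randomly.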
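To measure a meaningful success rate, the loop could stop on either flag and count an episode as a success only when it terminated at the goal. Below is a minimal sketch of that idea, assuming the five-value step API used in the question (state, reward, terminated, truncated, info). The helper names run_episode and success_rate are hypothetical, chosen only for illustration.

```python
def run_episode(env):
    """Run one random-action episode; return True only if the goal was reached."""
    state, _ = env.reset()
    while True:
        action = env.action_space.sample()
        state, reward, terminated, truncated, info = env.step(action)
        if terminated:   # the environment reports that the goal was reached
            return True
        if truncated:    # the time limit ran out before the goal was reached
            return False


def success_rate(env, episodes=100):
    """Fraction of random-action episodes that end at the goal."""
    successes = sum(run_episode(env) for _ in range(episodes))
    return successes / episodes


# Usage (assumes the gymnasium package is installed):
#   import gymnasium as gym
#   env = gym.make("MountainCar-v0")
#   print(success_rate(env, episodes=100))
```

With the time limit respected, a random agent should rarely if ever reach the flag, which would be consistent with the 0% figure mentioned in the question.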