IndexError: arrays used as indices must be of integer (or boolean) type with OpenAI gym

I'm working on an OpenAI Gym project and have been writing some code to solve an environment. However, I keep getting the error "IndexError: arrays used as indices must be of integer (or boolean) type", which I assume refers to how I'm indexing an array.

I've tried checking that my variable "state" is an integer with print(type(state)), and it reported an integer type.

My code and the full error are listed below:

import gym
import numpy as np
import pickle as pkl

mountEnv = gym.make("MountainCar-v0", render_mode = "human")

q_table = np.zeros(shape = (48,4))
# Parameters
EPSILON = 0.1
ALPHA = 0.9
GAMMA = 0.9
NUM_EPISODES = 500

def policy(state, explore=0.0):
    action = int(np.argmax(q_table[state]))
    if np.random.random() <= explore:
        action = int(np.random.randint(0,2))
    return action

for episode in range(NUM_EPISODES):

    done = False
    total_reward = 0
    episode_length = 0

    mountEnv.reset()
    state = 0
    while not done:
        action = policy(state, EPSILON)
        next_state, reward, done, _, _ = mountEnv.step(action)
        next_action = policy(next_state)

        q_table[state][action] += ALPHA + (reward + GAMMA * q_table[next_state][next_action] - q_table[state][action])

        state = next_state

        total_reward += reward
        episode_length += 1
    print("Episode:", episode, "Length:", episode_length, "Total Reward: ", total_reward)

mountEnv.close()
pkl.dump(q_table, open("q_learning_q_table.pkl", "wb"))

The error message I'm getting looks like this:

/usr/local/lib/python3.10/dist-packages/gym/utils/passive_env_checker.py:233: DeprecationWarning: `np.bool8` is a deprecated alias for `np.bool_`.  (Deprecated NumPy 1.24)
  if not isinstance(terminated, (bool, np.bool8)):
Traceback (most recent call last):
  File "/home/matteo/Downloads/Q-Learning-and-SARSA-Mountain-Car-v0-main/matteos-code.py", line 31, in <module>
    next_action = policy(next_state)
  File "/home/matteo/Downloads/Q-Learning-and-SARSA-Mountain-Car-v0-main/matteos-code.py", line 15, in policy
    action = int(np.argmax(q_table[state]))
IndexError: arrays used as indices must be of integer (or boolean) type

Process finished with exit code 1

Solution

Your check did not catch the problem because it looked at the wrong value. The traceback points at policy(next_state): state starts out as the integer 0, but next_state is whatever mountEnv.step(action) returns. Check the variable right before the line that fails, not somewhere else afterwards. It is also worth practicing with an IDE that lets you set breakpoints and step through the code, so you can follow exactly what value each variable has at every point.
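As a rough illustration, the failure can be reproduced without Gym at all. The sketch below assumes an observation shaped like the one MountainCar-v0 returns (a float32 array holding position and velocity); the sample values are made up, and error_message is a name introduced only for this example.

```python
import numpy as np

q_table = np.zeros(shape=(48, 4))

state = 0
# Made-up sample, shaped like a MountainCar-v0 observation: [position, velocity].
next_state = np.array([-0.5, 0.0], dtype=np.float32)

# Inspect both values right before they are used as an index.
for name, value in (("state", state), ("next_state", next_state)):
    print(name, type(value), getattr(value, "dtype", None))

q_table[state]  # works: a plain integer selects one row

try:
    q_table[next_state]  # fails: a float array is not a valid index
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

Printing the type of next_state inside policy, just before q_table[state], would have shown a NumPy float array rather than an int.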

Your actual core issue is that you are treating Mountain Car as if it were Cliff Walking. Walking along a cliff on a grid and driving a car up a mountain have almost nothing in common; it is like trying to use a chess engine to control a self-driving car.

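As the documentation for the two environments shows, Cliff Walking has 48 discrete states and 4 discrete actions (up, right, down, left), which is exactly what a (48, 4) Q-table fits. Mountain Car instead gives you an observation of 2 continuous values, the position along the x-axis and the velocity, and it has 3 discrete actions rather than 4. A pair of floats cannot be used as a row index into your table.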

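As written, none of your decision-making code makes sense when applied to the Mountain Car problem: a tabular method needs the continuous observation mapped to discrete indices before it can look anything up.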
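One common way to keep a tabular approach is to bin the continuous observation into integer indices first. The following is only a sketch under stated assumptions: the bounds are the documented MountainCar-v0 observation limits, while N_BINS, the table shape, and the discretize helper are hypothetical choices for illustration, not part of Gym.

```python
import numpy as np

# Documented MountainCar-v0 observation bounds: (position, velocity).
OBS_LOW = np.array([-1.2, -0.07])
OBS_HIGH = np.array([0.6, 0.07])

N_BINS = 20    # hypothetical resolution per dimension
N_ACTIONS = 3  # MountainCar-v0: push left, no push, push right

# One Q-value per (position bin, velocity bin, action).
q_table = np.zeros((N_BINS, N_BINS, N_ACTIONS))

def discretize(obs):
    """Map a continuous observation to a tuple of integer bin indices."""
    ratio = (np.asarray(obs, dtype=np.float64) - OBS_LOW) / (OBS_HIGH - OBS_LOW)
    idx = (ratio * N_BINS).astype(int)
    # Clip so the upper bound lands in the last bin instead of overflowing.
    return tuple(np.clip(idx, 0, N_BINS - 1))

# Made-up sample observation standing in for the array returned by env.step().
state = discretize(np.array([-0.5, 0.0], dtype=np.float32))
action = int(np.argmax(q_table[state]))
print(state, action)
```

With this in place, every observation from reset() and step() would go through discretize before it touches q_table, and the exploration branch would draw from all 3 actions.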