reinforcement-learning, dqn

Question about the size of the action and observation spaces in reinforcement learning


I am building a custom environment for a reinforcement learning (RL) project.

In common examples such as Pong, Atari games, and Super Mario, the action and observation spaces are quite small.

In my project, however, the action and observation spaces are much larger than in those examples.

I expect at least 5000+ actions and observations.

How can I handle action and observation spaces this large effectively?

Currently I am using Q-table learning, and I handle the large spaces with a wrapper function.

But this seems very inefficient.


Solution

  • Yes, Q-table learning is quite old and requires an extremely large amount of memory, since it stores a Q-value for every state-action pair in a table. In your case, Q-table learning does not seem good enough. A better choice would be a Deep Q-Network (DQN), which replaces the table with a neural network (see the first sketch below), although even DQN is not especially efficient at this scale.

    As for the huge observation space, that is fine. But an action space of 5000+ actions is very large, and it will take a lot of time to converge. To reduce training time, I would recommend PPO (see the second sketch below).
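
    To make the DQN suggestion concrete, here is a minimal sketch of a Q-network in PyTorch. It is only an illustration under assumptions: the observation is taken to be a flat float vector, the action space is discrete, and the sizes and layer widths below are placeholders for the 5000+ figure in the question, not tuned values.

    ```python
    # Minimal DQN-style Q-network sketch (PyTorch). Assumes a flat float
    # observation vector and a discrete action space; sizes are placeholders.
    import torch
    import torch.nn as nn

    OBS_DIM = 5000      # assumed observation vector length
    N_ACTIONS = 5000    # assumed number of discrete actions

    class QNetwork(nn.Module):
        def __init__(self, obs_dim: int, n_actions: int):
            super().__init__()
            # The network replaces the Q-table: it maps a state to one
            # Q-value per action, so memory no longer grows with |S| x |A|.
            self.net = nn.Sequential(
                nn.Linear(obs_dim, 512),
                nn.ReLU(),
                nn.Linear(512, 512),
                nn.ReLU(),
                nn.Linear(512, n_actions),
            )

        def forward(self, obs: torch.Tensor) -> torch.Tensor:
            return self.net(obs)

    q_net = QNetwork(OBS_DIM, N_ACTIONS)
    obs = torch.zeros(1, OBS_DIM)            # dummy observation
    q_values = q_net(obs)                     # shape: (1, N_ACTIONS)
    greedy_action = q_values.argmax(dim=1)    # epsilon-greedy would add exploration
    ```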
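
    And here is a minimal PPO sketch using Stable-Baselines3 (the library choice is an assumption; only the algorithm is named above). The custom environment is hypothetical, so `CartPole-v1` stands in for your own Gymnasium environment with a large discrete action space; the hyperparameters are defaults, not recommendations.

    ```python
    # Minimal PPO sketch with Stable-Baselines3 (library choice is an assumption).
    import gymnasium as gym
    from stable_baselines3 import PPO

    # Replace with your custom env, e.g. a hypothetical MyLargeEnv with
    # a Discrete(5000) action space and a Box observation space.
    env = gym.make("CartPole-v1")

    model = PPO(
        "MlpPolicy",       # MLP policy suits flat vector observations
        env,
        n_steps=2048,
        batch_size=256,
        learning_rate=3e-4,
        verbose=1,
    )
    model.learn(total_timesteps=100_000)

    # Greedy rollout with the trained policy
    obs, _ = env.reset()
    action, _ = model.predict(obs, deterministic=True)
    ```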