This is apparently so obvious and basic that I can't find any tutorials on it, but how do I set up a state space for a Q-learning environment?
If I understand correctly, every state needs to be associated with a single value, right? If so, what do I do when I have more than one input variable? In essence:
stateSpace = ???
Once I do have a state space, how do I change states? Say the state is based on 3 variables: V1, V2, and V3. The Q-learning algorithm only receives a single-number representation of the state, right? How do I use the variables and the state space to produce a single value representing a state?
I'm sorry if this is obvious/basic; thank you for your time.
I think you might be a bit confused about the parameters involved in Q-learning. Here's what we have:
Reward: The reward given to the agent for entering a state. This can be positive or negative but should be a single number.
State: All the relevant information about the state of the game.
Observation: A tensor containing the information the agent is allowed to know about the state of the game.
Q-Value: The "quality" of taking a certain action in a given state.
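To make that concrete for the question you asked: in *tabular* Q-learning, every distinct combination of your input variables is one state, i.e. one row of the Q-table, so you can turn (V1, V2, V3) into a single integer with mixed-radix arithmetic. Here's a minimal sketch in Python; the bin counts and action count are made-up values for illustration:

```python
import numpy as np

# Assumed for illustration: V1 can take 4 discrete values, V2 5, and V3 3.
N_V1, N_V2, N_V3 = 4, 5, 3

n_states = N_V1 * N_V2 * N_V3   # 60 distinct (V1, V2, V3) combinations
n_actions = 2                   # also assumed

# The "state space" is then just the rows of the Q-table.
q_table = np.zeros((n_states, n_actions))

def state_index(v1, v2, v3):
    """Map the tuple (v1, v2, v3) to a single integer in [0, n_states)."""
    # Mixed-radix encoding; np.ravel_multi_index((v1, v2, v3),
    # (N_V1, N_V2, N_V3)) computes the same thing.
    return (v1 * N_V2 + v2) * N_V3 + v3

s = state_index(2, 4, 1)   # one concrete state
print(q_table[s])          # the Q-values of every action in that state
```

If V1, V2, and V3 are continuous, you first discretize them into bins; and if the combined space gets too large to enumerate, that's the point where you replace the table with function approximation and feed the observation vector to a network instead.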
We train the Q-function (the table above, or a network that approximates it) by comparing what we expect the quality of a certain action to be (how much it improves our reward in the long term) with what we actually find it to be after making that move.
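Formally, that comparison is the standard Q-learning update:

$$Q(s, a) \leftarrow Q(s, a) + \alpha \big[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \big]$$

where $\alpha$ is the learning rate, $\gamma$ is the discount factor, $r$ is the reward received, and $s'$ is the state you land in. The bracketed term is exactly "what we found" minus "what we expected".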
On each tick, the environment updates its state, and then the agent makes a new observation that gives it new input values to work with.
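Putting the pieces together, one tick might look like the sketch below. Everything here is an assumption for illustration: `env` is a hypothetical environment whose `step(action)` returns the new values of the three state variables plus a reward and a done flag, the hyperparameters are placeholder values, and `state_index` is the encoding function from the earlier sketch:

```python
import numpy as np

alpha, gamma, epsilon = 0.1, 0.99, 0.1   # assumed hyperparameters

def training_step(env, q_table, s):
    """One tick: pick an action, observe the result, update the Q-table."""
    # Epsilon-greedy action selection from the current Q-values.
    if np.random.rand() < epsilon:
        a = np.random.randint(q_table.shape[1])
    else:
        a = int(np.argmax(q_table[s]))

    # Hypothetical environment API: adapt to whatever your game exposes.
    v1, v2, v3, r, done = env.step(a)
    s_next = state_index(v1, v2, v3)

    # TD target: the reward we actually got, plus the discounted best
    # Q-value of the state we landed in (no future value if done).
    target = r if done else r + gamma * np.max(q_table[s_next])
    q_table[s, a] += alpha * (target - q_table[s, a])

    return s_next, done
```

Note that the single integer `s` is all the algorithm ever sees; the raw variables V1, V2, and V3 only appear when computing the next index.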