Tags: algorithm, reinforcement-learning, q-learning, function-approximation

Q-learning with linear function approximation


I would like some guidance on how to use the Q-learning algorithm with function approximation. For the basic Q-learning algorithm I have found examples and I think I understand it, but with function approximation I get into trouble. Can somebody explain, through a short example, how it works?

What I know:

  1. Instead of using a matrix for the Q-values, we use features and parameters.
  2. Approximate Q(s, a) as a linear combination of features and parameters.
  3. Update the parameters (see the sketch after this list).
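
To make those three steps concrete, here is a minimal sketch of Q-learning with linear function approximation in Python. Everything in it is an assumption rather than something from the question: a hypothetical environment `env` with `reset()`/`step()` methods, a feature function `phi(state, action)` returning a fixed-length NumPy vector, a list of discrete action ids `actions`, and hand-picked hyperparameters.

```python
import numpy as np

def epsilon_greedy(w, phi, state, actions, epsilon):
    """Pick a random action with probability epsilon, else the greedy one."""
    if np.random.rand() < epsilon:
        return int(np.random.choice(actions))
    q_values = [w @ phi(state, a) for a in actions]
    return actions[int(np.argmax(q_values))]

def q_learning_linear(env, phi, actions, n_features,
                      episodes=500, alpha=0.01, gamma=0.99, epsilon=0.1):
    w = np.zeros(n_features)                     # parameter vector (step 1)
    for _ in range(episodes):
        state, done = env.reset(), False         # hypothetical env interface
        while not done:
            a = epsilon_greedy(w, phi, state, actions, epsilon)
            next_state, reward, done = env.step(a)
            # Step 2: Q(s, a) is approximated by the linear combination w . phi(s, a)
            q_sa = w @ phi(state, a)
            # Greedy value of the next state (0 if terminal)
            q_next = 0.0 if done else max(w @ phi(next_state, b) for b in actions)
            # Step 3: TD error and gradient step; for a linear model the
            # gradient of Q(s, a) with respect to w is simply phi(s, a)
            td_error = reward + gamma * q_next - q_sa
            w += alpha * td_error * phi(state, a)
            state = next_state
    return w
```

The only change compared to tabular Q-learning is that, instead of writing the updated value into a table entry, the TD error is used to nudge the parameter vector `w` in the direction of the feature vector of the visited state-action pair.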

I have checked this paper: Q-learning with function approximation

But I can't find any useful tutorial on how to use it.

Thanks for the help!


Solution

  • In my view, this is one of the best references to start with. It is well written, with several pseudo-code examples. In your case, you can simplify the algorithms by ignoring eligibility traces.

    Also, in my experience, and depending on your use case, Q-learning might not work very well (it sometimes needs huge amounts of experience data). You can try Fitted Q-iteration, for example, which is a batch algorithm (see the sketch below).
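
    To illustrate the batch idea, here is a hedged sketch of Fitted Q-iteration with a linear model, trained on a fixed set of transitions. The transition tuple layout `(s, a, r, s_next, done)`, the feature function `phi`, and the action list `actions` are assumptions for the example, not something stated in the answer.

```python
import numpy as np

def fitted_q_iteration(transitions, phi, actions, n_features,
                       n_iterations=50, gamma=0.99):
    """Fit w so that w . phi(s, a) approximates the Q-values of a fixed batch."""
    w = np.zeros(n_features)
    for _ in range(n_iterations):
        X, y = [], []
        for (s, a, r, s_next, done) in transitions:
            # Bootstrap the regression target from the current parameters
            target = r if done else r + gamma * max(
                w @ phi(s_next, b) for b in actions)
            X.append(phi(s, a))
            y.append(target)
        # Solve the least-squares problem w . phi(s, a) ~ target over the batch
        w, *_ = np.linalg.lstsq(np.asarray(X), np.asarray(y), rcond=None)
    return w
```

    Because each iteration refits the parameters on the whole batch, the method reuses every stored transition many times, which is why it tends to need less fresh experience than online Q-learning.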