This seems like it should be obvious, but I can't find resources on it anywhere. I am building a reinforcement learning model with OpenAI Gym's any_trading environment and stablebaselines3. There are a ton of online tutorials and documentation for training and evaluating the model, but almost nothing on actually using it in practice.
E.g., I want the model constantly looking at today's data and making predictions about what action I should take to lock in tomorrow's profits.
Reinforcement learning algorithms all seem to have a model.predict() method, but you have to pass it the environment, which is just more historical data. What if I want it to use today's data to predict tomorrow's values? Do I just include data up to today in the test set and retrain the model from scratch each time I want it to make a prediction?
E.g., the original training data ranges from 2014-01-01 to today (2023-02-12), and I run through the whole train and test process. Then tomorrow I start from scratch and train/test using date_ranges 2014-01-01 to today (2023-02-13), then the next day 2014-01-01 to today (2023-02-14), and so on? How do I actually make real-time predictions with a reinforcement learning model, as opposed to continually evaluating how it would have performed on past data?
Thanks.
This is a very good and practical question. I assume you train your RL agent in stablebaselines3 on all the historical data and then apply the trained agent to predict tomorrow's action. The short answer is no: you don't need to train your agent from scratch every day.
First, you need to understand the procedures for learning and prediction:
In the learning (training) process:
In the prediction process (which has only steps 2 and 3 of the training process):
You can train your model repeatedly on the historical data until you are satisfied with its performance. In this retraining process, after each pass through the entire historical data, you can save the model and then load the saved model as the starting point for the next pass.
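The save-and-reload loop described above might be sketched like this. It is only an illustration: `train_round` is a hypothetical helper, `algo_cls` is assumed to be a stablebaselines3 algorithm class such as `PPO`, and `"MlpPolicy"` and the file name are placeholder choices, not something stated in the question.

```python
def train_round(algo_cls, env, save_path, timesteps, resume):
    """Run one training pass and save the result.

    algo_cls is assumed to be a stablebaselines3 algorithm class (e.g. PPO);
    only its constructor, load(), learn() and save() are used here.
    """
    if resume:
        # Start from the previously saved weights instead of from scratch.
        model = algo_cls.load(save_path, env=env)
    else:
        # "MlpPolicy" is an assumed policy choice for illustration.
        model = algo_cls("MlpPolicy", env)

    # When resuming, keep the timestep counter running rather than resetting it.
    model.learn(total_timesteps=timesteps, reset_num_timesteps=not resume)
    model.save(save_path)
    return model


# Possible usage (needs stable_baselines3 and a training env):
# from stable_baselines3 import PPO
# train_round(PPO, env, "trading_agent", 100_000, resume=False)  # first pass
# train_round(PPO, env, "trading_agent", 100_000, resume=True)   # later passes
```

Passing the algorithm class in as an argument keeps the sketch independent of which algorithm you actually picked.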
Once you have a good model, you don't need to keep training it on the new data that arrives after 2023-02-12. The model is still valid.
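To act on today's data, the trained model only needs one observation shaped like the ones it saw during training; no environment has to be stepped through history. Below is a minimal sketch under stated assumptions: the helper names are hypothetical, and the observation layout (a window of closing prices plus their day-to-day differences) is an assumption about what the training environment emitted. Whatever features your environment actually produced, the live observation must match them exactly.

```python
import numpy as np


def latest_observation(closes, window_size):
    """Build one observation from the most recent prices.

    Assumed layout: shape (window_size, 2) with columns
    [close, close - previous close]. Adjust to match your training env.
    """
    closes = np.asarray(closes, dtype=np.float32)
    if closes.shape[0] < window_size + 1:
        raise ValueError("need at least window_size + 1 prices")
    diffs = np.diff(closes)  # diffs[i] = closes[i + 1] - closes[i]
    return np.column_stack((closes[-window_size:], diffs[-window_size:]))


def decide_action(model, closes, window_size):
    """Ask a trained model for the action to take given data up to today.

    model is any object with a stablebaselines3-style
    predict(observation, deterministic=...) -> (action, state) method.
    """
    obs = latest_observation(closes, window_size)
    action, _state = model.predict(obs, deterministic=True)
    return int(action)


# Possible usage (needs stable_baselines3 and a saved model):
# from stable_baselines3 import PPO
# model = PPO.load("trading_agent")
# action = decide_action(model, closes_up_to_today, window_size=10)
```

`deterministic=True` asks for the policy's preferred action rather than a sampled one, which is usually what you want when acting rather than exploring.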
You may think that new data is generated every day and that the most recent data is the most valuable. In that case, you can periodically update your existing model with the new data using the following procedure:
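As a rough sketch of what such a periodic update could look like in code (one possible shape, not a definitive recipe): `maybe_update` and `make_env` are hypothetical names, `algo_cls` is assumed to be a stablebaselines3 algorithm class, `history` is assumed to be a plain list of rows, and the thresholds are arbitrary illustration values.

```python
def maybe_update(algo_cls, model_path, make_env, history,
                 last_trained_len, min_new_rows, recent_rows, timesteps):
    """Fine-tune the saved model once enough new rows have accumulated.

    make_env is a hypothetical callable that builds a training environment
    from a slice of the history. Returns the history length the model has
    now been trained up to.
    """
    new_rows = len(history) - last_trained_len
    if new_rows < min_new_rows:
        return last_trained_len  # not enough new data yet, keep the old model

    # Train on a recent slice so the newest data carries the update.
    env = make_env(history[-recent_rows:])

    # Continue from the existing weights rather than starting from scratch.
    model = algo_cls.load(model_path, env=env)
    model.learn(total_timesteps=timesteps, reset_num_timesteps=False)
    model.save(model_path)
    return len(history)


# Possible usage (needs stable_baselines3; make_env is your own function):
# from stable_baselines3 import PPO
# trained_up_to = maybe_update(PPO, "trading_agent", make_env, rows,
#                              trained_up_to, min_new_rows=20,
#                              recent_rows=250, timesteps=20_000)
```

How often to update and how much recent history to use are tuning decisions; it is worth checking the updated model on held-out data before relying on it.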