Facebook NeuralProphet - Loading model from pickle for prediction


I have a weekly job that reads data from a CSV file, builds a NeuralProphet model from it, and dumps the model to a pickle file for later use.

from neuralprophet import NeuralProphet
from matplotlib import pyplot as plt
import pandas as pd
import pickle

data_location = '/input_data/'
df = pd.read_csv(data_location + 'input.csv')

# fit() returns the training metrics, not the model, so keep the model object itself
model = NeuralProphet()
metrics = model.fit(df, freq="5min")

with open('model/neuralprophet_model.pkl', "wb") as f:
     # dump information to that file
     pickle.dump(model, f)

The above Python code runs weekly and dumps the model to a file.

Now, I have a different Python file which loads the pickle file and predicts future dates.

Let's say I have the last 2 years of data in a CSV file and created the model from it. Now I would like to predict the future based on that model.

from neuralprophet import NeuralProphet
import pandas as pd
import pickle

with open('model/neuralprophet_model.pkl', "rb") as f:
    model = pickle.load(f)

# Predict the next hour at 5-minute intervals (12 periods)
future = model.make_future_dataframe(periods=12, freq='5min')
forecast = model.predict(future)

Is this correct? Here, I don't pass the data to make_future_dataframe, but all the examples on the internet pass the data as well. Since the data was already used to train the model, I am using only the model here. Why do we need to pass the data again when we call predict (for some unknown future date) based on the model?


Solution

  • The NeuralProphet model (the pickle file) is just a trained neural network. The simplest analogy is a trained linear regression model (from scikit-learn, etc.): y = Ax + b, where A and b are the trained vectors. These vectors alone cannot produce y without x. Your model in this example is just the A and b vectors. NeuralProphet uses auto-regressive feed-forward neural networks, so there are more vector terms and they are not all linear.

    That's why NeuralProphet requires historic data at prediction time (in make_future_dataframe), and not only in model.fit: the historic data is x. x can come from the same dataset that you used to train A and b, or from a different but statistically similar dataset (you can use d-bar testing and confidence intervals to determine similarity here).

    This is how we use models across most supervised learning applications: train on one sample dataset, then apply the model to predict outcomes on similar datasets.
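To make the y = Ax + b analogy concrete, here is a small pure-Python sketch. The numbers and the names `predict_linear`, `AR_WEIGHTS` and `forecast_ar` are made up for illustration and are not NeuralProphet internals; the point is only that trained weights need an input (recent history) before they can produce a forecast.

```python
# Hypothetical "trained" parameters of a linear model y = A*x + b.
A, b = 2.0, 1.0


def predict_linear(x):
    """The trained A and b are useless until an x is supplied."""
    return A * x + b


# Hypothetical autoregressive weights, applied to y[t-1], y[t-2], y[t-3].
AR_WEIGHTS = [0.5, 0.3, 0.2]


def forecast_ar(history, steps):
    """Roll the autoregressive model forward `steps` times from `history`."""
    n_lags = len(AR_WEIGHTS)
    if len(history) < n_lags:
        raise ValueError(f"need at least {n_lags} historic values to forecast")
    window = list(history[-n_lags:])  # the most recent observations are the "x"
    forecasts = []
    for _ in range(steps):
        # Pair each weight with its lag: newest observation first.
        next_value = sum(w * y for w, y in zip(AR_WEIGHTS, reversed(window)))
        forecasts.append(next_value)
        window = window[1:] + [next_value]  # slide the window forward
    return forecasts
```

Calling `forecast_ar([], 12)` raises an error for the same reason the model alone cannot forecast: the weights describe how to continue a series, not where the series currently is.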
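Applied to the question's second script, this means passing the historic dataframe to make_future_dataframe. The following is a minimal sketch, assuming the pickled object is the fitted NeuralProphet instance itself and that the CSV has the `ds` and `y` columns NeuralProphet expects. The helper name `forecast_next_periods` is hypothetical.

```python
def forecast_next_periods(model, history_df, periods=12):
    """Forecast `periods` steps past the end of `history_df`.

    `model` is assumed to be a fitted NeuralProphet instance (for example,
    one loaded with pickle.load). `history_df` plays the role of x: it
    tells the model where the series currently ends.
    """
    # make_future_dataframe takes the historic dataframe as its first argument.
    future = model.make_future_dataframe(history_df, periods=periods)
    return model.predict(future)


# Usage (commented out because it needs the files from the question):
# import pickle
# import pandas as pd
# with open('model/neuralprophet_model.pkl', 'rb') as f:
#     model = pickle.load(f)
# df = pd.read_csv('/input_data/input.csv')
# forecast = forecast_next_periods(model, df, periods=12)  # 12 x 5 min = 1 hour
```

The weekly job therefore needs to leave the recent data available to the prediction script (or the script can re-read the same CSV), since the pickle file alone does not carry it in a form that make_future_dataframe can use.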