I am trying to use mlflow to save a model and then load it later to make predictions.
I'm using an xgboost.XGBRegressor model and its sklearn functions .predict() and .predict_proba() to make predictions, but it turns out that mlflow doesn't support models that implement the sklearn API, so when I load the model later, mlflow returns an instance of xgboost.Booster, which doesn't implement the .predict() or .predict_proba() functions.
Is there a way to convert an xgboost.Booster back into an xgboost.sklearn.XGBRegressor object that implements the sklearn API functions?
Have you tried wrapping your model in a custom class, then logging and loading it using mlflow.pyfunc.PythonModel?
I put together a simple example, and upon loading the model back, the wrapped regressor correctly shows <class 'xgboost.sklearn.XGBRegressor'> as its type.
Example:
import mlflow
import mlflow.pyfunc
import xgboost as xgb

xg_reg = xgb.XGBRegressor(...)
# xg_reg must be fitted before it is logged and used for prediction

class CustomModel(mlflow.pyfunc.PythonModel):
    # Wraps the sklearn-API regressor so it is serialized and restored as-is
    def __init__(self, xgbRegressor):
        self.xgbRegressor = xgbRegressor

    def predict(self, context, input_data):
        print(type(self.xgbRegressor))
        return self.xgbRegressor.predict(input_data)

# Log model to local directory
with mlflow.start_run():
    custom_model = CustomModel(xg_reg)
    mlflow.pyfunc.log_model("custome_model", python_model=custom_model)

# Load model back
from mlflow.pyfunc import load_model

model = load_model("/mlruns/0/../artifacts/custome_model")
model.predict(X_test)
Output:
<class 'xgboost.sklearn.XGBRegressor'>
[ 9.107417 ]