Tags: python, mlflow, custom-function

Why does my PythonModel expect 3 arguments but only 2 are defined?


I am trying to create a PythonModel by

  1. creating a class
  2. logging it with mlflow

The class looks like:

# Model wrapper class
class ModelWrapper(mlflow.pyfunc.PythonModel):
    def __init__(self):
        self.generator = None

    def predict(self, json_input):
    
        # preprocess
        df = function1(self, json_input) # returns a pandas df
    
        # calculation
        result = function12(self, df) # returns the result
    
        return result

The problem comes up when I try to log the model with mlflow:

with mlflow.start_run() as run:     
    mlflow.pyfunc.log_model("custom_model",
                         python_model=ModelWrapper(),
                         input_example=json_input)

It gives me this error/warning:

WARNING mlflow.utils.requirements_utils: Failed to run predict on input_example, dependencies introduced in predict are not captured.
TypeError('ModelWrapper.predict() takes 2 positional arguments but 3 were given')
Traceback (most recent call last): ...

When calling the class directly using:

model_wrapper = ModelWrapper() 
print(model_wrapper.predict(json_input))

I get the desired output.

But when I log the model, or load it and then call the predict function, I get the error above.

Does anyone know why this happens, or what the third argument is? I only pass "json_input" and "self" to the function.


Solution

  • First of all, self is the instance itself, which Python passes implicitly as the first argument of a method. When you define a method with self as its first parameter (e.g. def myfunction(self, myarg)), you have to call it as an attribute of the instance (e.g. self.myfunction(myarg)), not as a bare function. This does two things:

    1. It calls the method defined on the class, not a function external to the class
    2. It passes self as the first argument of the method call

    Fixed class definition:

    # Model wrapper class
    class ModelWrapper(mlflow.pyfunc.PythonModel):
        def __init__(self):
            self.generator = None
    
        def predict(self, json_input):
        
            # preprocess
            df = self.function1(json_input) # returns a pandas df
        
            # calculation
            result = self.function12(df) # returns the result
        
            return result
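
    The difference between the two call styles can be seen in a small plain-Python sketch. The Greeter class and its method names are hypothetical and exist only for illustration; they are not part of the question's code. Note that the bare-name call only fails when no module-level function of that name exists.

```python
class Greeter:
    def __init__(self, greeting):
        self.greeting = greeting

    def format_name(self, name):
        return f"{self.greeting}, {name}!"

    def greet_ok(self, name):
        # Attribute lookup on self finds the method and binds self automatically.
        return self.format_name(name)

    def greet_broken(self, name):
        # A bare name is looked up in the enclosing and module scopes, not on
        # the class, so this raises NameError unless such a function exists there.
        return format_name(self, name)


g = Greeter("Hello")
ok = g.greet_ok("Ada")
# Calling through the class and passing the instance explicitly is equivalent.
same = Greeter.format_name(g, "Ada")

try:
    g.greet_broken("Ada")
    broken_error = None
except NameError as exc:
    broken_error = exc
```

    If function1 and function12 are module-level functions that accept the instance as their first parameter, the original calls would work as written, which would be consistent with the direct call succeeding.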
    
    

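    Second, if that doesn't fix the issue, you may need to add parameters to your predict definition. The error says three positional arguments were given, so whatever calls predict during log_model is passing one more argument than self and json_input. In MLflow's documented signature, predict(self, context, model_input), that extra argument is the context object, which is passed before the model input.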
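
    To see how one extra positional argument produces this exact TypeError, here is a minimal sketch in plain Python. The call_like_framework helper and both model classes are hypothetical stand-ins, not MLflow's actual code; the sketch only assumes that the caller passes a context value ahead of the input.

```python
class TwoArgModel:
    # Mirrors the question: only self and the input are accepted.
    def predict(self, json_input):
        return json_input


class ThreeArgModel:
    # Accepts the extra leading argument, even though it is unused here.
    def predict(self, context, model_input):
        return model_input


def call_like_framework(model, data):
    # Hypothetical stand-in for a framework's internal call:
    # a context placeholder is passed before the actual input.
    context = None
    return model.predict(context, data)


try:
    call_like_framework(TwoArgModel(), {"a": 1})
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

result = call_like_framework(ThreeArgModel(), {"a": 1})
```

    Calling TwoArgModel().predict({"a": 1}) directly still works, which matches the behaviour described in the question: the failure only appears when something else makes the call.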

    You can always add *args and **kwargs to your predict method definition to see which arguments log_model actually passes:

    <-SNIP->

        # find out what is being passed to your function:
        def predict(self, *args, **kwargs):
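            # positional arguments, with their position
            for index, value in enumerate(args):
                print(index, value)
            # keyword arguments, with their names
            for key, value in kwargs.items():
                print(key, value)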
    

    <-SNIP->

    Once you know what is being sent, you can name the parameters accordingly. For example, if you find that json_input arrives as args[1], after another object in args[0], your definition would look like this:

    <-SNIP->

        def predict(self, context, json_input):
    

    <-SNIP->