python, azure-machine-learning, azure-machine-learning-service, mlflow

Load a Registered Model in Azure ML Studio in an Interactive Notebook


I'm using Azure Machine Learning Studio and I have an sklearn mlflow model stored in my default datastore (blob storage), which I have then registered as a model asset. How can I load this model inside an interactive notebook to perform some quick model inferencing and testing before deploying it as a batch endpoint?

I have seen a post linked here that suggests downloading the model artefacts locally, but I shouldn't need to do this: I should be able to load the model directly from the datastore or the registered asset without duplicating the model in multiple places. I have tried the following without success.

Reading from Registered Model Asset

import mlflow
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(DefaultAzureCredential(), "<subscription_id>", "<resource_group>", "<workspace_name>")

model = ml_client.models.get("<model_name>", version="1")
loaded_model = mlflow.sklearn.load_model(model.id)

>>> OSError: No such file or directory: ...

Reading from Datastore

import mlflow

model_path = "<datastore_uri_to_model_folder>"
loaded_model = mlflow.sklearn.load_model(model_path)

>>> DeserializationError: Cannot deserialize content-type: text/html

Solution

  • According to this documentation, the model must be loaded using one of the supported URI formats.

    But you are passing a model ID and a datastore path, which are not supported in an interactive notebook.

    So, try this code:

    # Load the model from the workspace model registry by name and version
    loaded_model = mlflow.sklearn.load_model("models:/local-mlflow-example/1")
    loaded_model.predict(sample_data["data"])  # sample_data: the example input payload
    

    Output: the model's predictions for the sample data.

    Here the path should be models:/<model_name>/<model_version>.
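    For reference, that registry URI can be assembled from the registered model's name and version (both available on the object returned by ml_client.models.get). The helper below is a hypothetical convenience for illustration, not part of any SDK:

```python
def model_registry_uri(name, version):
    """Build an MLflow model-registry URI of the form models:/<name>/<version>."""
    return f"models:/{name}/{version}"

# For the registered model used above:
print(model_registry_uri("local-mlflow-example", 1))  # models:/local-mlflow-example/1
```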

    model.id, model.path, or a datastore path is supported only when running in the context of an Azure ML job.

    So, to use model.id or model.path, submit a command job like the one below.

    from azure.ai.ml import command, Input, Output
    from azure.ai.ml.constants import AssetTypes
    
    inputs = {
        "input_data": Input(
            type=AssetTypes.URI_FILE, path="./mlflow-model/input_example.json"
        ),
        "input_model": Input(type=AssetTypes.MLFLOW_MODEL, path=model.path),
    }
    
    outputs = {
        "output_folder": Output(
            type=AssetTypes.URI_FOLDER,
            path=f"azureml://subscriptions/{subscription_id}/resourcegroups/{resource_group}/workspaces/{workspace}/datastores/workspaceblobstore/paths/predictions",
        )
    }
    
    job = command(
        code="./src",  # local path where the code is stored
        command="python load_score.py --input_model ${{inputs.input_model}} --input_data ${{inputs.input_data}} --output_folder ${{outputs.output_folder}}",
        inputs=inputs,
        outputs=outputs,
        environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu:1",
        compute="cpu-cluster",
    )
    
    # submit the command
    returned_job = ml_client.jobs.create_or_update(job)
    # get a URL for the status of the job
    returned_job.studio_url
    

    The load_score.py script loads the model, runs predictions, and writes the results:

    import argparse
    import json
    import os

    import mlflow.sklearn

    parser = argparse.ArgumentParser()
    parser.add_argument("--input_data", type=str)
    parser.add_argument("--input_model", type=str)
    parser.add_argument("--output_folder", type=str)
    args = parser.parse_args()

    # Read the sample input payload
    with open(args.input_data) as f:
        sample_data = json.load(f)

    print(sample_data)

    # Load the MLflow-format sklearn model from the mounted input path
    sk_model = mlflow.sklearn.load_model(args.input_model)
    predictions = sk_model.predict(sample_data["data"])

    # Write the predictions to stdout and to the output folder
    print(predictions)

    with open(os.path.join(args.output_folder, "predictions.txt"), "x") as output:
        output.write(str(predictions))
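    For a quick local check of the payload shape load_score.py expects: the rows passed to predict() sit under a "data" key, matching the pandas "split" layout MLflow uses when saving input examples. A minimal sketch with made-up feature names and values:

```python
import json

# Hypothetical input_example.json content in the "split" layout that
# load_score.py indexes with sample_data["data"]
payload_text = json.dumps({
    "columns": ["feature_1", "feature_2"],
    "data": [[0.5, 1.2], [3.1, 0.7]],
})

sample_data = json.loads(payload_text)
print(sample_data["data"])  # [[0.5, 1.2], [3.1, 0.7]]
```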
    

    Job output: the predictions printed by the script.

    Refer to this notebook for more information.

    If you face a "no such file or directory" error, it is because required model files are missing: for example, the MLmodel file or model.pkl has been deleted from the storage account or moved to another folder.
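    Before calling load_model, you can sanity-check that the model folder still contains the files MLflow expects. This sketch assumes the artefacts are available on a local or mounted path; missing_model_files is our own helper, not an SDK function, and the exact file set can vary by MLflow version (only the MLmodel descriptor is strictly required):

```python
import os

# Files typically present in an MLflow sklearn model folder; only the
# MLmodel descriptor is strictly required for mlflow to load the model.
EXPECTED_FILES = ["MLmodel", "conda.yaml", "requirements.txt"]

def missing_model_files(model_dir):
    """Return the expected MLflow files that are absent from model_dir."""
    return [
        name
        for name in EXPECTED_FILES
        if not os.path.exists(os.path.join(model_dir, name))
    ]
```

If MLmodel shows up in the returned list, loading will fail with the "no such file" error above.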