python rest endpoint azure-machine-learning-service

Query a specific model deployment in an endpoint inside Azure Machine Learning


I'm working in Python. I have successfully created two model deployments in an Azure Machine Learning real-time endpoint, with a 50-50 traffic split.

Using the Python SDK, I know that to call a specific model deployment I have to specify the deployment_name parameter in the invoke method (tutorial and docs).

However, when I try to do the same thing through the REST API, I cannot find any parameter to pass in the endpoint's query string. As a result, the model deployment is chosen at random according to the traffic split.

Code example:

import requests
import json

endpoint = "https://my-endpoint.westeurope.inference.ml.azure.com/score"
api_key = "xyz"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}"
}

data = {
    "input_data": {
        "columns": [
            "a",
            "b"
            ],
        "index": [0, 1],
        "data": [
            [0.00, 185.0],
            [18.00, 181.0]
        ]
    }
}

response = requests.post(endpoint, headers=headers, data=str.encode(json.dumps(data)))
print(response.json())

Since the Python SDK has the invoke method, I expected a matching parameter for the endpoint query string, e.g. endpoint = https://my-endpoint.westeurope.inference.ml.azure.com/score?deployment=model1 (with deployment=model1 appended at the end), but I did not find anything in the docs.

Any suggestions? Does anyone know if this option is available?


Solution

  • According to this documentation, you need to send the azureml-model-deployment request header, with the deployment name as its value.

    Alter your code as shown below.

    headers = {
        "Content-Type": "application/json",
        # Route the request to this deployment instead of using the traffic split
        "azureml-model-deployment": "<your-deployment-name>",
        "Authorization": f"Bearer {api_key}"
    }
    
    data = {
        "input_data": {
            "columns": [
                "a",
                "b"
                ],
            "index": [0, 1],
            "data": [
                [0.00, 185.0],
                [18.00, 181.0]
            ]
        }
    }
    
    response = requests.post(endpoint, headers=headers, data=str.encode(json.dumps(data)))
    print(response.json())
    

    Output for different deployments:

    Deployment blue

    [image: output for deployment blue]

    Deployment red

    [image: output for deployment red]

    The request for red returns "deployment not found" because I haven't created a deployment named red.
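
If you switch between deployments often, the header logic above can be wrapped in a small helper. This is only a sketch: the function names build_headers and score, and the timeout default, are hypothetical and not part of any Azure SDK. It assumes the same key-based Bearer authentication and /score URL as in the question. When no deployment name is passed, the header is left out and the endpoint's traffic split applies as usual.

```python
from typing import Any, Dict, Optional


def build_headers(api_key: str, deployment_name: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for an Azure ML online endpoint (hypothetical helper).

    If deployment_name is given, the azureml-model-deployment header asks the
    endpoint to route the request to that deployment. If it is omitted, the
    header is not sent and the endpoint's traffic split decides.
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    if deployment_name:
        headers["azureml-model-deployment"] = deployment_name
    return headers


def score(
    endpoint: str,
    api_key: str,
    payload: Dict[str, Any],
    deployment_name: Optional[str] = None,
    timeout: float = 30.0,  # assumed default, adjust to your model's latency
) -> Any:
    """POST payload to the scoring URL, optionally targeting one deployment."""
    import requests  # imported here so build_headers works without requests installed

    response = requests.post(
        endpoint,
        headers=build_headers(api_key, deployment_name),
        json=payload,  # requests serializes the dict to JSON for us
        timeout=timeout,
    )
    # Raises requests.HTTPError on a 4xx/5xx status, e.g. an unknown deployment name
    response.raise_for_status()
    return response.json()
```

Usage would look like score(endpoint, api_key, data, deployment_name="blue") to hit one deployment, or score(endpoint, api_key, data) to let the traffic split decide.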