I'm working in Python. I have already deployed two model deployments to an Azure Machine Learning real-time endpoint, with a 50-50 traffic split.
Using the Python SDK, I know that to call a specific model deployment I have to specify the deployment_name parameter in the invoke method (tutorial and docs).
However, I could not find an equivalent parameter to pass in the query string when calling the endpoint through the REST API. As a result, the model deployment is chosen at random according to the traffic split.
Code example:
import requests
import json

endpoint = "https://my-endpoint.westeurope.inference.ml.azure.com/score"
api_key = "xyz"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}"
}

data = {
    "input_data": {
        "columns": [
            "a",
            "b"
        ],
        "index": [0, 1],
        "data": [
            [0.00, 185.0],
            [18.00, 181.0]
        ]
    }
}

response = requests.post(endpoint, headers=headers, data=str.encode(json.dumps(data)))
print(response.json())
Since the Python SDK has the invoke method, I expected there to be a parameter that could be added to the endpoint query string, e.g. endpoint = https://my-endpoint.westeurope.inference.ml.azure.com/score?deployment=model1 (I appended deployment=model1 at the end), but I did not find anything in the docs.
Any suggestions? Does anyone know if this option is available?
According to this documentation, you need to set the azureml-model-deployment header to the name of the deployment you want to call.
Alter your code as shown below.
headers = {
    "Content-Type": "application/json",
    "azureml-model-deployment": "<your-deployment-name>",
    "Authorization": f"Bearer {api_key}"
}

data = {
    "input_data": {
        "columns": [
            "a",
            "b"
        ],
        "index": [0, 1],
        "data": [
            [0.00, 185.0],
            [18.00, 181.0]
        ]
    }
}

response = requests.post(endpoint, headers=headers, data=str.encode(json.dumps(data)))
print(response.json())
Output for different deployments.
Deployment blue
Deployment red
This returns a "deployment not found" error, because I haven't created a deployment named red.
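If you call several deployments from the same script, the header logic can be wrapped in a small helper. The following is only a sketch: the function names build_headers and score are hypothetical (they are not part of any Azure SDK), and it assumes key- or token-based Bearer authentication as in the question. When no deployment name is passed, the header is omitted and the endpoint falls back to the traffic split.

```python
import json


def build_headers(api_key, deployment_name=None):
    """Build request headers; target one deployment when a name is given."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    if deployment_name:
        # Without this header, the endpoint's traffic split picks the deployment.
        headers["azureml-model-deployment"] = deployment_name
    return headers


def score(endpoint, api_key, payload, deployment_name=None):
    """POST the payload to the scoring URL and return the parsed JSON body."""
    # Imported here so build_headers can be used without requests installed.
    import requests

    response = requests.post(
        endpoint,
        headers=build_headers(api_key, deployment_name),
        data=json.dumps(payload).encode("utf-8"),
    )
    # Raise on HTTP errors, e.g. when the named deployment does not exist.
    response.raise_for_status()
    return response.json()
```

With the variables from the question, score(endpoint, api_key, data, deployment_name="blue") would send the request to the blue deployment, while score(endpoint, api_key, data) would leave the routing to the traffic split.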