azure, azure-machine-learning-service

When using an online endpoint for an AzureML classification model, how do I get a response that includes more fields than just the predicted value or probability?


I have a binary classification model deployed to an online endpoint. I can pass data into the endpoint and get it to return the predicted class, or predicted probability.
I want it to return an ID field that is also passed into the model, so that the scores can be matched back to the record they are associated with.

Below is the relevant part of my code to invoke the endpoint:

import json
import urllib.request

# data1 holds the records to score; it is defined elsewhere in the full script
data = {
    "Inputs": {
        "data": data1
    },
    "GlobalParameters": {
        "method": "predict_proba"
    }
}
body = json.dumps(data).encode('utf-8')

url = 'http://MyDeployement.MyRegion.azurecontainer.io/score'

headers = {'Content-Type': 'application/json'}
req = urllib.request.Request(url, body, headers)

response = urllib.request.urlopen(req)
result = response.read()

# Decode using the charset declared by the response, defaulting to UTF-8
encoding = response.info().get_content_charset('utf-8')
JSON_object = json.loads(result.decode(encoding))
print(JSON_object)

Thanks.

I have added most of the current scoring script below (as generated by AutoML), since it now seems pertinent:

import...  
  
data_sample = PandasParameterType(pd.DataFrame({"age": pd.Series([0], dtype="int64"), "job": pd.Series(["example_value"], dtype="object"), "marital": pd.Series(["example_value"], dtype="object"), "education": pd.Series(["example_value"], dtype="object"), ...))  
input_sample = StandardPythonParameterType({'data': data_sample})  
method_sample = StandardPythonParameterType("predict")  
sample_global_params = StandardPythonParameterType({"method": method_sample})  
  
result_sample = NumpyParameterType(np.array(["example_value"]))  
output_sample = StandardPythonParameterType({'Results':result_sample})  
  
try:  
    log_server.enable_telemetry(INSTRUMENTATION_KEY)  
    log_server.set_verbosity('INFO')  
    logger = logging.getLogger('azureml.automl.core.scoring_script_v2')  
except:  
    pass  
  
  
def init():  
    global model  
    # Locate the deployed model file and deserialize it back  
    # into a sklearn model  
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model.pkl')  
    path = os.path.normpath(model_path)  
    path_split = path.split(os.sep)  
    log_server.update_custom_dimensions({'model_name': path_split[-3], 'model_version': path_split[-2]})  
    try:  
        logger.info("Loading model from path.")  
        model = joblib.load(model_path)  
        logger.info("Loading successful.")  
    except Exception as e:  
        logging_utilities.log_traceback(e, logger)  
        raise  
  
@input_schema('GlobalParameters', sample_global_params, convert_to_provided_type=False)  
@input_schema('Inputs', input_sample)  
@output_schema(output_sample)  
def run(Inputs, GlobalParameters={"method": "predict"}):  
    data = Inputs['data']  
    if GlobalParameters.get("method", None) == "predict_proba":  
        result = model.predict_proba(data)  
    elif GlobalParameters.get("method", None) == "predict":  
        result = model.predict(data)  
    else:  
        raise Exception(f"Invalid predict method argument received. GlobalParameters: {GlobalParameters}")  
    if isinstance(result, pd.DataFrame):  
        result = result.values  
    return {'Results':result.tolist()}

Solution

  • Since an update to Azure AutoML added the option to deselect fields in the training data so that the model does not consider them, the data you pass to the deployed model should no longer contain the ID field. The scoring script returns one result per input row, in the same order as the rows were sent, so you can remove the ID field before scoring and then join the scores back onto the original file by row position to recover the IDs.

    Read the file to score into a data frame and drop the ID column, then save the result or pass it directly to the endpoint as required. Next, read the original file and the endpoint output, and combine the ID from the former with the score from the latter in a new data frame, so that each score sits next to its ID. You can then write that data frame out or use it as appropriate.
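
The steps above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: it uses plain lists of dicts from the standard library instead of a data frame, and it assumes that the endpoint accepts "data" as a list of records and replies with a JSON object containing a 'Results' list, as the scoring script in the question suggests. The helper names (split_ids, build_payload, score, attach_ids) and the "customer_id" field in the usage note are hypothetical.

```python
import json
import urllib.request


def split_ids(records, id_field):
    """Separate the ID values from the feature columns, keeping row order."""
    ids = [row[id_field] for row in records]
    features = [
        {key: value for key, value in row.items() if key != id_field}
        for row in records
    ]
    return ids, features


def build_payload(features, method="predict_proba"):
    """Build a request body in the shape used in the question."""
    return {
        "Inputs": {"data": features},
        "GlobalParameters": {"method": method},
    }


def score(url, payload):
    """POST the payload to the endpoint and return its 'Results' list.

    Assumption: the response body is a JSON object with a 'Results' key,
    matching the return value of run() in the scoring script.
    """
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    req = urllib.request.Request(url, body, headers)
    with urllib.request.urlopen(req) as response:
        encoding = response.info().get_content_charset("utf-8")
        return json.loads(response.read().decode(encoding))["Results"]


def attach_ids(ids, results):
    """Pair each ID with the result at the same row position."""
    if len(ids) != len(results):
        # A length mismatch means positional matching cannot be trusted.
        raise ValueError(
            f"Got {len(results)} results for {len(ids)} input rows"
        )
    return [{"id": i, "score": r} for i, r in zip(ids, results)]
```

With records loaded from the file to score, the calls would run in this order: ids, features = split_ids(records, "customer_id"), then results = score(url, build_payload(features)), then attach_ids(ids, results). The length check in attach_ids is there because the match relies purely on row position, so a dropped or duplicated row would otherwise misalign every score after it without any error.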