I have multiple models in Google Vertex AI and I want to create an endpoint to serve my predictions. I need to run aggregation algorithms, like the Voting algorithm on the output of my models. I have not found any ways of using the models together so that I can run the voting algorithms on the results. Do I have to create a new model, curl my existing models and then run my algorithms on the results?
There is no in-built provision to implement aggregation algorithms in Vertex AI. To curl
results from the models then aggregate them, we would need to deploy all of them to individual endpoints. Instead, I would suggest the below method to deploy the models and the meta-model(aggregate model) to a single endpoint using custom containers for prediction. The custom container requirements can be found here.
You can load the model artifacts from GCS into a custom container. If the same set of models are used (i.e) the input models to the meta-model do not change, you can package them inside the container to reduce load time. Then, a custom HTTP logic can be used to return the aggregation output like so. This is a sample custom flask server logic.
def get_models_from_gcs():
## Pull the required model artifacts from GCS and load them here.
models = [model_1, model_2, model_3]
return models
def aggregate_predictions(predictions):
## Your aggregation algorithm here
return aggregated_result
@app.post(os.environ['AIP_PREDICT_ROUTE'])
async def predict(request: Request):
body = await request.json()
instances = body["instances"]
inputs = np.asarray(instances)
preprocessed_inputs = _preprocessor.preprocess(inputs)
models = get_models_from_gcs()
predictions = []
for model in models:
predictions.append(model.predict(preprocessed_inputs))
aggregated_result = aggregate_predictions(predictions)
return {"aggregated_predictions": aggregated_result}