Tags: google-cloud-platform, google-cloud-automl, google-cloud-vertex-ai

GCP AI Platform API - Object Detection Metrics at Class Level (Python)


I have trained an AutoML Object Detection model in Vertex AI (a service under AI Platform in GCP). I am trying to access the model evaluation metrics for each label (precision, recall, accuracy, etc.) at varying Confidence Score Thresholds and IoU Thresholds.

However, I am stuck at step one: I cannot even get the model's aggregate performance metrics, much less the metrics at a granular level. I have followed this instruction, but I cannot figure out what the evaluation_id is (also see the official sample code snippet here), which is:

from google.cloud import aiplatform


def get_model_evaluation_image_object_detection_sample(
    project: str,
    model_id: str,
    evaluation_id: str,
    location: str = "us-central1",
    api_endpoint: str = "us-central1-aiplatform.googleapis.com",
):
    # The AI Platform services require regional API endpoints.
    client_options = {"api_endpoint": api_endpoint}
    # Initialize client that will be used to create and send requests.
    # This client only needs to be created once, and can be reused for multiple requests.
    client = aiplatform.gapic.ModelServiceClient(client_options=client_options)
    name = client.model_evaluation_path(
        project=project, location=location, model=model_id, evaluation=evaluation_id
    )
    response = client.get_model_evaluation(name=name)
    print("response:", response)

After some time I figured out that for a model trained in the EU, location and api_endpoint should be passed as:

location: str = "europe-west4"
api_endpoint: str = "europe-west4-aiplatform.googleapis.com"

But whatever I try for evaluation_id leads to the following error:

InvalidArgument: 400 List of found errors:  1.Field: name; Message: Invalid ModelEvaluation resource name.

The documentation says the following (which seems to contain what I need):

For the bounding box metric, Vertex AI returns an array of metric values at different IoU threshold values (between 0 and 1) and confidence threshold values (between 0 and 1). For example, you can narrow in on evaluation metrics at an IoU threshold of 0.85 and a confidence threshold of 0.8228. By viewing these different threshold values, you can see how they affect other metrics such as precision and recall.

Without knowing what is contained in the output array, how would that work for each class? Basically, I need the model metrics for each class at varying IoU threshold and confidence threshold values.

I have also tried to query the AutoML API instead, like:

from google.cloud import automl

client_options = {'api_endpoint': 'eu-automl.googleapis.com:443'}

client = automl.AutoMlClient(client_options=client_options)
# Get the full path of the model.
model_full_id = client.model_path(project_id, "europe-west4", model_id)

print("List of model evaluations:")
for evaluation in client.list_model_evaluations(parent=model_full_id, filter=""):
    print("Model evaluation name: {}".format(evaluation.name))
    print("Model annotation spec id: {}".format(evaluation.annotation_spec_id))
    print("Create Time: {}".format(evaluation.create_time))
    print("Evaluation example count: {}".format(evaluation.evaluated_example_count))
    print(
        "Classification model evaluation metrics: {}".format(
            evaluation.classification_evaluation_metrics
        )
    )

Unsurprisingly, this doesn't work either and leads to:

InvalidArgument: 400 List of found errors:  1.Field: parent; Message: The provided location ID doesn't match the endpoint. For automl.googleapis.com, the valid location ID is `us-central1`. For eu-automl.googleapis.com, the valid location ID is `eu`.

Solution

  • I was able to get the model evaluation response using aiplatform_v1, which is well documented; this is the reference linked from the Vertex AI reference page.

    In this script I ran list_model_evaluations() to get the evaluation name and used it as input for get_model_evaluation(), which returns the evaluation details for the Confidence Score Threshold, IoU Threshold, etc.

    NOTE: I don't have a trained model in europe-west4, so I used us-central1 instead. But if you have trained in europe-west4, you should use https://europe-west4-aiplatform.googleapis.com as the api_endpoint, as per the locations document.

    from google.cloud import aiplatform_v1 as aiplatform
    
    api_endpoint = 'us-central1-aiplatform.googleapis.com'
    client_options = {"api_endpoint": api_endpoint}
    client_model = aiplatform.services.model_service.ModelServiceClient(client_options=client_options)
    project_id = 'your-project-id'
    location = 'us-central1'
    model_id = '999999999999'
    
    model_name = f'projects/{project_id}/locations/{location}/models/{model_id}'
    list_eval_request = aiplatform.types.ListModelEvaluationsRequest(parent=model_name)
    list_eval = client_model.list_model_evaluations(request=list_eval_request)
    eval_name = ''
    # Keep the (typically single) evaluation name returned for the model.
    for val in list_eval:
        eval_name = val.name
    
    get_eval_request = aiplatform.types.GetModelEvaluationRequest(name=eval_name)
    get_eval = client_model.get_model_evaluation(request=get_eval_request)
    print(get_eval)
    

    See response snippet:

    name: "projects/xxxxxxxxx/locations/us-central1/models/999999999999/evaluations/1234567890"
    metrics_schema_uri: "gs://google-cloud-aiplatform/schema/modelevaluation/image_object_detection_metrics_1.0.0.yaml"
    metrics {
      struct_value {
        fields {
          key: "boundingBoxMeanAveragePrecision"
          value {
            number_value: 0.20201288
          }
        }
        fields {
          key: "boundingBoxMetrics"
          value {
            list_value {
              values {
                struct_value {
                  fields {
                    key: "confidenceMetrics"
                    value {
                      list_value {
                        values {
                          struct_value {
                            fields {
                              key: "confidenceThreshold"
                              value {
                                number_value: 0.06579724
                              }
                            }
                            fields {
                              key: "f1Score"
                              value {
                                number_value: 0.15670435
                              }
                            }
                            fields {
                              key: "precision"
                              value {
                                number_value: 0.09326923
                              }
                            }
                            fields {
                              key: "recall"
                              value {
                                number_value: 0.48989898
                              }
                            }
                          }
                        }
                        values {
                          struct_value {
    ....
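
    If you want plain Python values instead of the protobuf printout, one way (a minimal sketch) is to convert the response to a dict with json_format and walk boundingBoxMetrics. The iouThreshold key is assumed from the image_object_detection_metrics_1.0.0.yaml schema referenced above; the other keys appear in the snippet.

    from google.protobuf import json_format

    # Convert the ModelEvaluation protobuf to a plain dict (keys come out in
    # camelCase, matching the response snippet above).
    eval_dict = json_format.MessageToDict(get_eval._pb)
    metrics = eval_dict["metrics"]

    print("boundingBoxMeanAveragePrecision:", metrics.get("boundingBoxMeanAveragePrecision"))
    for bb in metrics["boundingBoxMetrics"]:
        # One entry per IoU threshold (iouThreshold assumed per the metrics schema).
        iou = bb.get("iouThreshold", 0.0)
        for cm in bb["confidenceMetrics"]:
            print(
                f"IoU {iou:.2f} | confidence {cm.get('confidenceThreshold', 0.0):.4f} | "
                f"precision {cm.get('precision', 0.0):.4f} | recall {cm.get('recall', 0.0):.4f} | "
                f"f1 {cm.get('f1Score', 0.0):.4f}"
            )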
    

    EDIT 1: Get response per class

    To get metrics per class, you can use list_model_evaluation_slices() to get the slice name for each class, then pass that name to get_model_evaluation_slice(). In this code I pushed the names into a list since I have multiple classes, then used the values stored in that list to get the metrics per class (a combined loop over all classes is sketched at the end of this answer).

    In my code I used label[0] to get a single response, for one class.

    from google.cloud import aiplatform_v1 as aiplatform
    
    api_endpoint = 'us-central1-aiplatform.googleapis.com'
    client_options = {"api_endpoint": api_endpoint}
    client_model = aiplatform.services.model_service.ModelServiceClient(client_options=client_options)
    project_id = 'your-project-id'
    location = 'us-central1'
    model_id = '999999999999'
    
    model_name = f'projects/{project_id}/locations/{location}/models/{model_id}'
    list_eval_request = aiplatform.types.ListModelEvaluationsRequest(parent=model_name)
    list_eval = client_model.list_model_evaluations(request=list_eval_request)
    eval_name = ''
    for val in list_eval:
        eval_name = val.name
    
    label=[]
    slice_eval_request = aiplatform.types.ListModelEvaluationSlicesRequest(parent=eval_name)
    slice_eval = client_model.list_model_evaluation_slices(request=slice_eval_request)
    for data in slice_eval:
        label.append(data.name)
    
    get_eval_slice_request = aiplatform.types.GetModelEvaluationSliceRequest(name=label[0])
    get_eval_slice = client_model.get_model_evaluation_slice(request=get_eval_slice_request)
    print(get_eval_slice)
    

    Print all classes: (screenshot)

    Classes in UI: (screenshot)

    Response snippet for a class:

    name: "projects/xxxxxxxxx/locations/us-central1/models/999999999/evaluations/0000000000/slices/777777777"
    slice_ {
      dimension: "annotationSpec"
      value: "Cheese"
    }
    metrics_schema_uri: "gs://google-cloud-aiplatform/schema/modelevaluation/image_object_detection_metrics_1.0.0.yaml"
    metrics {
      struct_value {
        fields {
          key: "boundingBoxMeanAveragePrecision"
          value {
            number_value: 0.14256561
          }
        }
        fields {
          key: "boundingBoxMetrics"
          value {
            list_value {
              values {
                struct_value {
                  fields {
                    key: "confidenceMetrics"
                    value {
                      list_value {
                        values {
                          struct_value {
                            fields {
                              key: "confidenceThreshold"
                              value {
                                number_value: 0.06579724
                              }
                            }
                            fields {
                              key: "f1Score"
                              value {
                                number_value: 0.10344828
                              }
                            }
                            fields {
                              key: "precision"
                              value {
                                number_value: 0.06198347
                              }
                            }
    ....
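
    Putting it together, here is a minimal sketch that loops over every slice and prints, per class, the precision/recall at each IoU and confidence threshold. It builds on client_model, aiplatform and label from the code above; the iouThreshold key is again assumed from the metrics schema, and the class name is read from slice_.value as shown in the snippet.

    from google.protobuf import json_format

    # Loop over every per-class slice collected earlier and print its metrics
    # at each IoU / confidence threshold combination.
    for slice_name in label:
        request = aiplatform.types.GetModelEvaluationSliceRequest(name=slice_name)
        eval_slice = client_model.get_model_evaluation_slice(request=request)

        class_name = eval_slice.slice_.value  # e.g. "Cheese", as in the snippet above
        metrics = json_format.MessageToDict(eval_slice._pb)["metrics"]

        print(f"\nClass: {class_name}")
        print("  mAP:", metrics.get("boundingBoxMeanAveragePrecision"))
        for bb in metrics["boundingBoxMetrics"]:
            iou = bb.get("iouThreshold", 0.0)  # assumed key, per the metrics schema
            for cm in bb["confidenceMetrics"]:
                print(
                    f"  IoU {iou:.2f} | conf {cm.get('confidenceThreshold', 0.0):.4f} | "
                    f"precision {cm.get('precision', 0.0):.4f} | recall {cm.get('recall', 0.0):.4f}"
                )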