I am using an H2ORandomForestEstimator. What is the default target metric that H2O models use to pick the classification threshold in their predict()
method?
https://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/modeling.html#h2o.automl.H2OAutoML.predict
Is there a way to set this (e.g. to use one of the other metric-maximizing thresholds that can be seen when looking at the results of the get_params()
method)?
Currently I am doing something like this:
df_preds = mymodel.predict(df)
activation_threshold = mymodel.find_threshold_by_max_metric('f1', valid=True)
# adjust the predicted label for the desired metric's maximizing threshold
df_preds['predict'] = df_preds['my_positive_class'].apply(lambda probability: 'my_positive_class' if probability >= activation_threshold else 'my_negative_class')
There's no concept of a "target metric" when generating predictions, since you're just predicting the response for a row of data (there's no scoring here).
Edit: Thanks for clarifying your question. If you want to change which threshold is used to turn the predicted probabilities into labels, then what you're doing above is a good solution. If you have a suggestion for a utility function that would make this more straightforward, please file a JIRA with your idea (it could definitely be improved).
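For reference, the relabeling step itself is just a comparison against the chosen threshold. Below is a minimal sketch in plain Python, not H2O's own implementation: `relabel` is a hypothetical helper, the probabilities are made-up values, and the class names are the placeholders from the question. As far as I know, binomial H2O models label predictions with the max-F1 threshold by default, so this only matters if you want a different one.

```python
def relabel(probabilities, threshold,
            positive="my_positive_class", negative="my_negative_class"):
    """Assign the positive label when the probability reaches the threshold.

    Hypothetical helper for illustration; it mirrors the lambda in the
    question but works on any plain sequence of floats.
    """
    return [positive if p >= threshold else negative for p in probabilities]


# Made-up positive-class probabilities, standing in for the
# 'my_positive_class' column of the prediction frame.
probs = [0.10, 0.35, 0.62, 0.90]

# In practice the threshold would come from something like
# mymodel.find_threshold_by_max_metric('f1', valid=True).
labels = relabel(probs, threshold=0.35)
print(labels)
```

With an H2O prediction frame, one option is to convert it to pandas first with df_preds.as_data_frame() and pass the probability column to a helper like this.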