
Feature standardization in H2O AutoML


I'm wondering how to standardize features when using h2o's AutoML with deep learning and GLM algorithms.

Standardization appears to be supported for the deep learning and GLM models (https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/standardize.html), but h2o.automl does not accept the standardize = TRUE argument.

My questions are:

  1. Does AutoML automatically scale (i.e. standardize) the features when a deep learning or GLM algorithm is used? If so, does it also standardize automatically when I predict on new test data?

  2. If the answer to 1) is no, is there a built-in h2o function that achieves this so that I can do it manually? What's the recommended workflow for this with AutoML?

Solution

    1. Yes. H2O AutoML keeps most of the default hyperparameters for GLM and Deep Learning, and both algorithms default to standardize = TRUE.

    2. In H2O, every transformation applied during training is also applied at prediction time, so you can pass new data to the model unscaled and don't need to standardize it yourself.
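
If you want to check this on your own run, here is a minimal sketch. It assumes a local H2O cluster can be started with h2o.init(); the iris data, the max_models limit, and the seed are illustrative choices, not anything from the question. It restricts AutoML to GLM and Deep Learning, reads back the standardize setting each trained model reports, and then predicts on raw, unscaled test data.

```r
library(h2o)
h2o.init()

# Toy data (illustrative assumption): raw, unscaled iris measurements.
iris_hf <- as.h2o(iris)
splits <- h2o.splitFrame(iris_hf, ratios = 0.8, seed = 1)
train <- splits[[1]]
test <- splits[[2]]

# Restrict AutoML to the two algorithms the question asks about.
aml <- h2o.automl(
  y = "Species",
  training_frame = train,
  include_algos = c("GLM", "DeepLearning"),
  max_models = 3,
  seed = 1
)

# Read back the standardize setting that each trained model reports.
model_ids <- as.vector(aml@leaderboard$model_id)
standardize_used <- sapply(model_ids, function(id) {
  h2o.getModel(id)@allparameters$standardize
})
print(standardize_used)

# Predict on the raw test frame; no manual scaling step is added here.
preds <- h2o.predict(aml@leader, test)
head(preds)
```

Inspecting @allparameters on the models AutoML built is a quick way to confirm what was actually used, since h2o.automl itself does not expose standardize as an argument.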