I understand that sklearn requires categorical features to be encoded as dummy variables (one-hot encoded) when using sklearn.ensemble.RandomForestRegressor, and that XGBoost requires the same, but h2o permits raw categorical features in its h2o.estimators.random_forest.H2ORandomForestEstimator. Since h2o4gpu's implementation of random forest is built on top of XGBoost, does this mean support for raw categorical features is not included?
There is no native support for categorical columns in h2o4gpu (at least not yet), so you will have to one-hot encode (or label encode) your categorical columns yourself, just as you do for sklearn and xgboost.
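As a rough illustration of the two encodings mentioned above, here is a minimal sketch using pandas. The DataFrame and its column names (size, color) are made up for the example, and pandas is only one of several ways to do this. The encoded frame is what you would then pass to the estimator in place of the raw data.

```python
import pandas as pd

# Hypothetical toy data: one numeric feature and one categorical feature.
df = pd.DataFrame({
    "size": [1.0, 2.5, 3.0, 4.2],
    "color": ["red", "green", "blue", "green"],
})

# One-hot encoding: replace "color" with one 0/1 column per category.
one_hot = pd.get_dummies(df, columns=["color"], dtype=int)

# Label encoding: replace each category with an integer code.
# Codes follow the sorted category order (blue=0, green=1, red=2).
label_encoded = df.copy()
label_encoded["color"] = label_encoded["color"].astype("category").cat.codes

print(one_hot)
print(label_encoded)
```

Keep in mind that label encoding gives the categories an arbitrary numeric order, which tree splits will treat as ordinal. One-hot encoding avoids that at the cost of more columns.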