I have the following code:
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator
from h2o.grid.grid_search import H2OGridSearch
h2o.init()
data=h2o.import_file('dataset.csv')
train, test = data.split_frame(ratios=[0.8])
n_trees = [50, 100, 200, 300]
max_depth = [5, 6, 7]
learn_rate = [0.01, 0.05, 0.1]
min_rows = [10,15,20]
min_split_improvement = [0.00001, 0.0001]
hyper_parameters = {"ntrees":n_trees,
"max_depth":max_depth,
"learn_rate":learn_rate,
"min_rows":min_rows}
gs=H2OGridSearch(model=H2OGradientBoostingEstimator, hyper_params=hyper_parameters)
gs.train(x=train.columns, y=target_column, training_frame=train, validation_frame=test, distribution='bernoulli')
grid_perf=gs.get_grid(sort_by='auc',decreasing=True)
This runs a grid search over GBM models on the dataset. I want to save the result of the grid search, grid_perf, as a CSV file.
Something along the lines of:
h2o.export_file(grid_perf,'grid_search_results.csv')
Note: the code above works, so no debugging necessary, thanks.
I tried the export_file line above, but it fails with this error:
Argument python_obj should be a None | list | tuple | dict | numpy.ndarray | pandas.DataFrame | scipy.sparse.issparse, got H2OGridSearch
Thanks to Adam Valenta for the suggestion. Based on it, the solution is to pull the grid's summary table out as a pandas DataFrame and write that to CSV:
grid_perf=gs.get_grid(sort_by='auc', decreasing=True)
table = grid_perf._grid_json['summary_table'].as_data_frame()
table.to_csv('GridSearch1.csv',index=False)
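If the results need to be saved more than once, the three lines above could be wrapped in a small helper. This is only a sketch: save_grid_summary is a hypothetical name, not part of h2o, and it assumes the grid object still exposes the private _grid_json['summary_table'] entry used above and that as_data_frame() returns a pandas DataFrame. Because _grid_json is a private attribute, this may break in other h2o versions.

```python
import os


def save_grid_summary(grid, path):
    """Write a grid search's summary table to a CSV file and return the path.

    Hypothetical helper, not an h2o function. Assumes `grid` carries the
    `_grid_json['summary_table']` entry used in the solution above and that
    the table's `as_data_frame()` returns an object with `to_csv`
    (a pandas DataFrame in the solution above).
    """
    table = grid._grid_json['summary_table'].as_data_frame()

    # Create the target directory first if the path includes one.
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)

    # index=False keeps the pandas row index out of the file.
    table.to_csv(path, index=False)
    return path
```

With the grid from the question, the call would be save_grid_summary(grid_perf, 'GridSearch1.csv').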