Tags: python, tensorflow, keras, tf.keras, tensorflow-decision-forests

tensorflow keras RandomForestModel get_config() is empty


I want to review the hyperparameters passed to TensorFlow Decision Forests' tfdf.keras.RandomForestModel. I expected model.get_config() to return them, but get_config() always returns an empty dictionary, both right after creating the model and after training it.

This is the function that creates the model in my RandomForestWrapper class:

def add_new_model(self, model_name, params):
    # Prepare the train/test split before building the model.
    self.train_test_split()

    # Build the model from the hyperparameter dictionary.
    model = tfdf.keras.RandomForestModel(
        random_seed=params["random_seed"],
        num_trees=params["num_trees"],
        categorical_algorithm=params["categorical_algorithm"],
        compute_oob_performances=params["compute_oob_performances"],
        growing_strategy=params["growing_strategy"],
        honest=params["honest"],
        max_depth=params["max_depth"],
        max_num_nodes=params["max_num_nodes"],
    )

    print(model.get_config())
    # Register the model under its name.
    self.models[model_name] = model
    print(f"{model_name} added")

Example parameters:

params_v2 = {
    "random_seed": 123456,
    "num_trees": 1000,
    "categorical_algorithm": "CART",
    "compute_oob_performances": True,
    "growing_strategy": "LOCAL",
    "honest": True,
    "max_depth": 8,
    "max_num_nodes": None
}

I then instantiate the class and train the model:

rf_models = RF(data, obs_col="obs", class_col="cell_type")
rf_models.add_new_model("model_2", params_v2)
rf_models.train_model("model_2", verbose=False, metrics=["Accuracy"])

model = rf_models.models["model_2"]
model.get_config()

##
{}

In the model summary I can see that the parameters are accepted.


Solution

  • Regarding get_config(), notice what the docs state:

    Returns the config of the Model.

    Config is a Python dictionary (serializable) containing the configuration of an object, which in this case is a Model. This allows the Model to be reinstantiated later (without its trained weights) from this configuration.

    Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

    Developers of subclassed Model are advised to override this method, and continue to update the dict from super(MyModel, self).get_config() to provide the proper configuration of this Model. The default config is an empty dict. Optionally, raise NotImplementedError to allow Keras to attempt a default serialization.

    That default empty dict is what you are seeing. To inspect the hyperparameters, I think you can read the model.learner_params property instead (it is an attribute, not a method):

    import tensorflow_decision_forests as tfdf
    import pprint
    
    params_v2 = {
        "random_seed": 123456,
        "num_trees": 1000,
        "categorical_algorithm": "CART",
        "compute_oob_performances": True,
        "growing_strategy": "LOCAL",
        "honest": True,
        "max_depth": 8,
        "max_num_nodes": None
    }
    
    # from_config() is a classmethod, so no throwaway instance is needed.
    model = tfdf.keras.RandomForestModel.from_config(params_v2)
    pprint.pprint(model.learner_params)
    
    {'adapt_bootstrap_size_ratio_for_maximum_training_duration': False,
     'allow_na_conditions': False,
     'bootstrap_size_ratio': 1.0,
     'bootstrap_training_dataset': True,
     'categorical_algorithm': 'CART',
     'categorical_set_split_greedy_sampling': 0.1,
     'categorical_set_split_max_num_items': -1,
     'categorical_set_split_min_item_frequency': 1,
     'compute_oob_performances': True,
     'compute_oob_variable_importances': False,
     'growing_strategy': 'LOCAL',
     'honest': True,
     'honest_fixed_separation': False,
     'honest_ratio_leaf_examples': 0.5,
     'in_split_min_examples_check': True,
     'keep_non_leaf_label_distribution': True,
     'max_depth': 8,
     'max_num_nodes': None,
     'maximum_model_size_in_memory_in_bytes': -1.0,
     'maximum_training_duration_seconds': -1.0,
     'min_examples': 5,
     'missing_value_policy': 'GLOBAL_IMPUTATION',
     'num_candidate_attributes': 0,
     'num_candidate_attributes_ratio': -1.0,
     'num_oob_variable_importances_permutations': 1,
     'num_trees': 1000,
     'pure_serving_model': False,
     'random_seed': 123456,
     'sampling_with_replacement': True,
     'sorting_strategy': 'PRESORT',
     'sparse_oblique_normalization': None,
     'sparse_oblique_num_projections_exponent': None,
     'sparse_oblique_projection_density_factor': None,
     'sparse_oblique_weights': None,
     'split_axis': 'AXIS_ALIGNED',
     'uplift_min_examples_in_treatment': 5,
     'uplift_split_score': 'KULLBACK_LEIBLER',
     'winner_take_all': True}
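
If you mainly want to review what you passed in, another option is to have the wrapper remember the parameters itself. The following is only a minimal sketch: ModelRegistry, mismatches and FakeModel are hypothetical names, not part of TF-DF. FakeModel stands in for tfdf.keras.RandomForestModel so the snippet runs without TensorFlow installed, and the sketch assumes that learner_params behaves like a plain dict, as the output above suggests.

```python
import copy


class ModelRegistry:
    """Keeps each model together with the hyperparameters used to build it."""

    def __init__(self, model_factory):
        # model_factory is any callable accepting keyword arguments.
        # In real use this would be tfdf.keras.RandomForestModel (assumption:
        # it is not imported here so the sketch stays self-contained).
        self._factory = model_factory
        self.models = {}
        self.params = {}

    def add_new_model(self, model_name, params):
        # Store a copy so later edits to the caller's dict do not leak in.
        self.params[model_name] = copy.deepcopy(params)
        self.models[model_name] = self._factory(**params)

    def mismatches(self, model_name):
        """Return {key: (requested, actual)} where learner_params disagrees."""
        actual = self.models[model_name].learner_params
        return {
            key: (value, actual.get(key))
            for key, value in self.params[model_name].items()
            if actual.get(key) != value
        }


class FakeModel:
    """Stand-in for a TF-DF model; it only mimics the learner_params attribute."""

    def __init__(self, **kwargs):
        # Pretend the learner has one default that the caller did not set.
        self.learner_params = {"min_examples": 5, **kwargs}


registry = ModelRegistry(FakeModel)
registry.add_new_model("model_2", {"num_trees": 1000, "max_depth": 8})
print(registry.params["model_2"])      # the parameters exactly as requested
print(registry.mismatches("model_2"))  # empty when the model accepted them all
```

With the real library you would construct the registry as ModelRegistry(tfdf.keras.RandomForestModel). Keeping the requested parameters separately also lets you tell your own settings apart from the many defaults that learner_params reports.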