Tags: python, hyperparameters, optuna, fb-hydra, hydra

Store user attributes in Optuna Sweeper plugin for Hydra


How can I store additional information in an optuna trial when using it via the Hydra sweep plugin?

My use case is as follows: I want to optimize a set of hyperparameters, and I store the reproducibility information for every experiment (i.e., trial) in a separate database. I know I can get the best values via optuna.load_study().best_params, or even best_trial. However, that only lets me rerun the experiment, which can take quite some time. To avoid this, I need to link each trial to the corresponding entry in my own database, so I would like to store my database ID somewhere in the trial object.

Without Hydra, I suppose I'd set user attributes (trial.set_user_attr). However, with Hydra abstracting all of that away, there seems to be no option to do so.

I know that I can query my own database for the exact combination of best params that Optuna found, but that seems like an overly complicated solution to a simple problem.

Some minimal code:

from dataclasses import dataclass

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import MISSING


@dataclass
class TrainConfig:
    x: float | int = MISSING
    y: int = MISSING
    z: int | None = None


ConfigStore.instance().store(name="config", node=TrainConfig)


@hydra.main(version_base=None, config_path="conf", config_name="sweep")
def sphere(cfg: TrainConfig) -> float:
    x: float = cfg.x
    y: float = cfg.y
    return x**2 + y**2


if __name__ == "__main__":
    sphere()

conf/sweep.yaml:

defaults:
  - override hydra/sweeper: optuna
  - override hydra/sweeper/sampler: tpe

hydra:
  sweeper:
    sampler:
      seed: 123
    direction: minimize
    study_name: sphere
    storage: sqlite:///trials.db
    n_trials: 20
    n_jobs: 1
    params:
      x: range(-5.5, 5.5, step=0.5)
      y: choice(-5, 0, 5)
      z: choice(0, 3, 5)

x: 1
y: 1
z: 1

Solution

  • A hacky solution is to use the sweeper's custom_search_space hook: it receives each Optuna trial, so user attributes can be set there.

    hydra:
      sweeper:
        sampler:
          seed: 123
        direction: minimize
        study_name: sphere
        storage: sqlite:///trials.db
        n_trials: 20
        n_jobs: 1
        params:
          x: range(-5.5, 5.5, step=0.5)
          y: choice(-5, 0, 5)
          z: choice([0, 1], [2, 3], [2, 5])
        custom_search_space: package.run.configure
    
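    # In package/run.py, matching custom_search_space above
    from optuna.trial import Trial

    def configure(_, trial: Trial) -> None:
        # 123456 is a placeholder for the ID from your own database
        trial.set_user_attr("experiment_db_id", 123456)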