How can I store additional information in an Optuna trial when using it via the Hydra sweeper plugin?
My use case is as follows:
I want to optimize a bunch of hyperparameters. I am storing all reproducibility information of all experiments (i.e., trials) in a separate database.
I know I can get the best values via optuna.load_study().best_params or even best_trial. However, that only allows me to replicate the experiment, which can take quite some time. To avoid this, I need to link the trial to my own database: I would like to store the ID of my own database record somewhere in the trial object.
Without using Hydra, I suppose I'd set User Attributes. However, with Hydra abstracting all that away, there seems to be no option to do so.
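For reference, this is roughly what I mean by setting user attributes in plain Optuna (no Hydra involved). It is only a sketch: `save_experiment` is a hypothetical stand-in for writing to my own database, and the ID it returns is made up.

```python
from typing import Any


def save_experiment(params: dict) -> int:
    """Hypothetical stand-in for writing a run to my own database."""
    # Assumption: the real implementation would return the new record's ID.
    return 123456


def objective(trial: Any) -> float:
    # Same search space as the sweep config, expressed with Optuna's API.
    x = trial.suggest_float("x", -5.5, 5.5, step=0.5)
    y = trial.suggest_categorical("y", [-5, 0, 5])
    value = x**2 + y**2
    # Link the Optuna trial to the external database record.
    trial.set_user_attr("experiment_db_id", save_experiment({"x": x, "y": y}))
    return value


def run_study(n_trials: int = 20):
    # Imported here so the objective above stays importable without Optuna.
    import optuna

    study = optuna.create_study(direction="minimize")
    study.optimize(objective, n_trials=n_trials)
    return study
```

After `run_study()`, each trial in `study.trials` would carry `experiment_db_id` in its `user_attrs`.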
I know that I can just query my own database for the exact combination of best params that optuna found, but that just seems like a difficult solution to a simple problem.
Some minimal code:
from dataclasses import dataclass

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import MISSING


@dataclass
class TrainConfig:
    x: float | int = MISSING
    y: int = MISSING
    z: int | None = None


ConfigStore.instance().store(name="config", node=TrainConfig)


@hydra.main(version_base=None, config_path="conf", config_name="sweep")
def sphere(cfg: TrainConfig) -> float:
    x: float = cfg.x
    y: float = cfg.y
    return x**2 + y**2


if __name__ == "__main__":
    sphere()
defaults:
  - override hydra/sweeper: optuna
  - override hydra/sweeper/sampler: tpe

hydra:
  sweeper:
    sampler:
      seed: 123
    direction: minimize
    study_name: sphere
    storage: sqlite:///trials.db
    n_trials: 20
    n_jobs: 1
    params:
      x: range(-5.5, 5.5, step=0.5)
      y: choice(-5 ,0 ,5)
      z: choice(0, 3, 5)

x: 1
y: 1
z: 1
A hacky solution is to go through the sweeper's custom_search_space option:
hydra:
  sweeper:
    sampler:
      seed: 123
    direction: minimize
    study_name: sphere
    storage: sqlite:///trials.db
    n_trials: 20
    n_jobs: 1
    params:
      x: range(-5.5, 5.5, step=0.5)
      y: choice(-5 ,0 ,5)
      z: choice([0, 1], [2, 3], [2, 5])
    custom_search_space: package.run.configure
from optuna.trial import Trial

def configure(_, trial: Trial) -> None:
    # The first argument is the Hydra config, which is unused here.
    trial.set_user_attr("experiment_db_id", 123456)
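Once the attribute is stored, reading it back from the best trial could look something like the sketch below. The helper names are my own (hypothetical). The study name and storage URL are taken from the sweep config above, and I am assuming the attribute key `experiment_db_id` as used in `configure`.

```python
from typing import Any, Optional


def experiment_id_of(trial: Any, key: str = "experiment_db_id") -> Optional[int]:
    """Return the database ID stored on a trial, or None if it was never set."""
    # `user_attrs` is a plain dict on Optuna's trial objects.
    return trial.user_attrs.get(key)


def load_best_experiment_id(
    study_name: str = "sphere",
    storage: str = "sqlite:///trials.db",
) -> Optional[int]:
    # Imported here so the helper above works without Optuna installed.
    import optuna

    study = optuna.load_study(study_name=study_name, storage=storage)
    return experiment_id_of(study.best_trial)
```

With that, `load_best_experiment_id()` would give me the ID to look up in my own database instead of re-running the experiment.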