Is there a better way to reload a hydra config from an experiment with enumerations? Right now I reload it like so:
initialize_config_dir(config_dir=os.path.join(exp_dir, ".hydra"), job_name=config_name)
cfg = compose(config_name, overrides=overrides)
print(cfg.enum)
>>> ENUM1
But ENUM1 is actually an enumeration that normally loads as
>>> <SomeEnumClass.ENUM1: 'enum1'>
I am able to fix this by adding a configstore default to the experiment hydra file:
defaults:
- base_config_cs
Which now results in
initialize_config_dir(config_dir=os.path.join(exp_dir, ".hydra"), job_name=config_name)
cfg = compose(config_name, overrides=overrides)
print(cfg.enum)
>>> <SomeEnumClass.ENUM1: 'enum1'>
Is there a better way to do this without adding this? Or can I add the default in the python code?
This is a good question -- reliably reloading configs from previous Hydra runs is an area that could be improved.
As you've discovered, loading the saved file config.yaml directly results in an untyped DictConfig object.
The solution below involves a script called reload.py that creates a config node with a defaults list that loads both the schema base_config_cs and the saved file config.yaml.
At the end of this post I also give a simple solution that involves loading .hydra/overrides.yaml to re-run the config composition process.
Suppose you've run a Hydra job with the following setup:
# app.py
from dataclasses import dataclass
from enum import Enum

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import DictConfig

class SomeEnumClass(Enum):
    ENUM1 = 1
    ENUM2 = 2

@dataclass
class Schema:
    enum: SomeEnumClass
    x: int = 123
    y: str = "abc"

def store_schema() -> None:
    cs = ConfigStore.instance()
    cs.store(name="base_config_cs", node=Schema)

@hydra.main(config_path=".", config_name="foo")
def app(cfg: DictConfig) -> None:
    print(cfg)

if __name__ == "__main__":
    store_schema()
    app()
# foo.yaml
defaults:
- base_config_cs
- _self_
enum: ENUM1
x: 456
$ python app.py y=xyz
{'enum': <SomeEnumClass.ENUM1: 1>, 'x': 456, 'y': 'xyz'}
After running app.py, there exists a directory outputs/2022-02-05/06-42-42/.hydra containing the saved file config.yaml.
As you correctly pointed out in your question, to reload the saved config you must merge the schema base_config_cs with the contents of config.yaml. Here is a pattern for accomplishing that:
# reload.py
import os

from hydra import compose, initialize_config_dir
from hydra.core.config_store import ConfigStore

from app import store_schema

config_name = "config"
exp_dir = os.path.abspath("outputs/2022-02-05/07-19-56")
saved_cfg_dir = os.path.join(exp_dir, ".hydra")
assert os.path.exists(f"{saved_cfg_dir}/{config_name}.yaml")

store_schema()  # stores `base_config_cs`

cs = ConfigStore.instance()
cs.store(
    name="reload_conf",
    node={
        "defaults": [
            "base_config_cs",
            config_name,
        ]
    },
)

with initialize_config_dir(config_dir=saved_cfg_dir):
    cfg = compose("reload_conf")
print(cfg)
$ python reload.py
{'enum': <SomeEnumClass.ENUM1: 1>, 'x': 456, 'y': 'xyz'}
In the above Python file reload.py, we store a node called reload_conf in the ConfigStore. Storing reload_conf this way is equivalent to creating a file called reload_conf.yaml that is discoverable by Hydra on the config search path. This reload_conf node has a defaults list that loads both the schema base_config_cs and config. For this to work, the following two conditions must be met:
1. base_config_cs must be stored in the ConfigStore. This is accomplished by calling the store_schema function that we have imported from app.py.
2. config_name, i.e. config.yaml in this example, must be discoverable by Hydra (which is taken care of here by calling initialize_config_dir).
Note that in foo.yaml we have a defaults list ["base_config_cs", "_self_"] that loads the schema base_config_cs before loading the contents _self_ of foo. In order for reload_conf to reconstruct the app's config with the same merge order, base_config_cs should come before config_name in the defaults list belonging to reload_conf.
The above approach could be taken one step further by removing the defaults list from foo.yaml and using cs.store to ensure the same defaults list is used in both the app and the reloading script:
# app2.py
from dataclasses import dataclass
from enum import Enum
from typing import Any, List

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import MISSING, DictConfig

class SomeEnumClass(Enum):
    ENUM1 = 1
    ENUM2 = 2

@dataclass
class RootConfig:
    defaults: List[Any] = MISSING
    enum: SomeEnumClass = MISSING
    x: int = 123
    y: str = "abc"

def store_root_config(primary_config_name: str) -> None:
    cs = ConfigStore.instance()
    # defaults list defined here:
    cs.store(
        name="root_config", node=RootConfig(defaults=["_self_", primary_config_name])
    )

@hydra.main(config_path=".", config_name="root_config")
def app(cfg: DictConfig) -> None:
    print(cfg)

if __name__ == "__main__":
    store_root_config("foo2")
    app()
# foo2.yaml (note NO DEFAULTS LIST)
enum: ENUM1
x: 456
$ python app2.py hydra.job.chdir=false y=xyz
{'enum': <SomeEnumClass.ENUM1: 1>, 'x': 456, 'y': 'xyz'}
# reload2.py
import os

from hydra import compose, initialize_config_dir
from hydra.core.config_store import ConfigStore

from app2 import store_root_config

config_name = "config"
exp_dir = os.path.abspath("outputs/2022-02-05/07-45-43")
saved_cfg_dir = os.path.join(exp_dir, ".hydra")
assert os.path.exists(f"{saved_cfg_dir}/{config_name}.yaml")

store_root_config("config")

with initialize_config_dir(config_dir=saved_cfg_dir):
    cfg = compose("root_config")
print(cfg)
$ python reload2.py
{'enum': <SomeEnumClass.ENUM1: 1>, 'x': 456, 'y': 'xyz'}
A simpler alternative approach is to use .hydra/overrides.yaml to recompose the app's configuration based on the overrides that were originally passed to Hydra:
# reload3.py
import os

import yaml
from hydra import compose, initialize

from app import store_schema

config_name = "config"
exp_dir = os.path.abspath("outputs/2022-02-05/07-19-56")
saved_cfg_dir = os.path.join(exp_dir, ".hydra")
overrides_path = f"{saved_cfg_dir}/overrides.yaml"
assert os.path.exists(overrides_path)

# overrides.yaml is a plain list of strings, so safe_load is sufficient
with open(overrides_path, "r") as f:
    overrides = yaml.safe_load(f)
print(f"{overrides=}")

store_schema()

with initialize(config_path="."):
    cfg = compose("foo", overrides=overrides)
print(cfg)
$ python reload3.py
overrides=['y=xyz']
{'enum': <SomeEnumClass.ENUM1: 1>, 'x': 456, 'y': 'xyz'}
This approach has its drawbacks: if your app's configuration involves some non-hermetic operation like querying a timestamp (e.g. via Hydra's now resolver) or looking up an environment variable (e.g. via the oc.env resolver), the configuration composed by reload3.py might be different from the original version loaded in app.py.
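One way to notice such drift is to compare the recomposed config against the saved one. The following is a minimal sketch, assuming both configs have already been converted to plain dictionaries in the same way (for instance with OmegaConf.to_container). The helpers flatten and config_drift, as well as the stamp key and its values, are hypothetical and not part of Hydra:

```python
from typing import Any, Dict, Tuple


def flatten(cfg: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dotted keys, e.g. {"a": {"b": 1}} -> {"a.b": 1}."""
    flat: Dict[str, Any] = {}
    for key, value in cfg.items():
        dotted = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


def config_drift(
    saved: Dict[str, Any], recomposed: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {dotted_key: (saved_value, recomposed_value)} for keys that differ."""
    a, b = flatten(saved), flatten(recomposed)
    absent = "<absent>"  # placeholder for keys present on only one side
    return {
        key: (a.get(key, absent), b.get(key, absent))
        for key in sorted(a.keys() | b.keys())
        if a.get(key, absent) != b.get(key, absent)
    }


# Hypothetical example: a timestamp that was resolved again on reload.
saved = {"enum": "ENUM1", "x": 456, "stamp": "2022-02-05_07-19-56"}
recomposed = {"enum": "ENUM1", "x": 456, "stamp": "2022-02-06_10-00-00"}
drift = config_drift(saved, recomposed)
print(drift)
```

An empty result suggests the recomposed config matches the saved one for the keys compared; any entries point at values that changed between the original run and the reload.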