I am using pydantic to manage settings for an app that supports different datasets. Each has a set of overridable defaults, but they are different per datasets. Currently, I have all of the logic correctly implemented via validators:
from pydantic import BaseModel
class DatasetSettings(BaseModel):
dataset_name: str
table_name: str
@validator("table_name", always=True)
def validate_table_name(cls, v, values):
if isinstance(v, str):
return v
if values["dataset_name"] == "DATASET_1":
return "special_dataset_1_default_table"
if values["dataset_name"] == "DATASET_2":
return "special_dataset_2_default_table"
return "default_table"
class AppSettings(BaseModel):
dataset_settings: DatasetSettings
app_url: str
This way, I get different defaults based on dataset_name
, but the user can override them if necessary. This is the desired behavior. The trouble is that once there are more than a handful of such fields and names, it gets to be a mess to read and to maintain. It seems like inheritance/polymorphism would solve this problem but the pydantic factory logic seems too hardcoded to make it feasible, especially with nested models.
class Dataset1Settings(DatasetSettings):
dataset_name: str = "DATASET_1"
table_name: str = "special_dataset_1_default_table"
class Dataset2Settings(DatasetSettings):
dataset_name: str = "DATASET_2"
table_name: str = "special_dataset_2_default_table"
def dataset_settings_factory(dataset_name, table_name=None):
if dataset_name == "DATASET_1":
return Dataset1Settings(dataset_name, table_name)
if dataset_name == "DATASET_2":
return Dataset2Settings(dataset_name, table_name)
return DatasetSettings(dataset_name, table_name)
class AppSettings(BaseModel):
dataset_settings: DatasetSettings
app_url: str
Options I've considered:
__init__
of DatasetSettings
, instantiate the subclass and copy its attributes into the parent class. Kind of clunky.__init__
of AppSettings
using the dataset_settings_factory
to set the dataset_settings
attribute of AppSettings
. Not so good because the default behavior doesn't work in the DatasetSettings
at all, only when instantiated as a nested model in AppSettings
.I was hoping Field(default_factory=dataset_settings_factory)
would work, but the default_factory
is only for actual defaults so it has zero args. Is there some other way to intercept the args of a particular pydantic field and use a custom factory?
I ended up solving the problem following the first option, as follows. Code is runnable with pydantic 1.8.2 and pydantic 1.9.1.
from typing import Optional
from pydantic import BaseModel, Field
class DatasetSettings(BaseModel):
dataset_name: Optional[str] = Field(default="DATASET_1")
table_name: Optional[str] = None
def __init__(self, **data):
factory_dict = {"DATASET_1": Dataset1Settings, "DATASET_2": Dataset2Settings}
dataset_name = (
data["dataset_name"]
if "dataset_name" in data
else self.__fields__["dataset_name"].default
)
if dataset_name in factory_dict:
data = factory_dict[dataset_name](**data).dict()
super().__init__(**data)
class Dataset1Settings(BaseModel):
dataset_name: str = "DATASET_1"
table_name: str = "special_dataset_1_default_table"
class Dataset2Settings(BaseModel):
dataset_name: str = "DATASET_2"
table_name: str = "special_dataset_2_default_table"
class AppSettings(BaseModel):
dataset_settings: DatasetSettings = Field(default_factory=DatasetSettings)
app_url: Optional[str]
app_settings = AppSettings(dataset_settings={"dataset_name": "DATASET_1"})
assert app_settings.dataset_settings.table_name == "special_dataset_1_default_table"
app_settings = AppSettings(dataset_settings={"dataset_name": "DATASET_2"})
assert app_settings.dataset_settings.table_name == "special_dataset_2_default_table"
# bonus: no args mode
app_settings = AppSettings()
assert app_settings.dataset_settings.table_name == "special_dataset_1_default_table"
A couple of gotchas I discovered along the way:
Dataset1Settings
inherits from DatasetSettings
, it enters a recursive loop calling init on init ad infinitum. This could be broken with some introspection, but I opted for the duck approach.DatasetSettings
. I'm sure there's a way to call the validation logic anyway but the current solution effectively sidesteps whatever class-level validation you have by only initing with super().__init__
BaseSettings
objects, but you have to drag their cumbersome init args: def __init__(
self,
_env_file: Union[Path, str, None] = None,
_env_file_encoding: Optional[str] = None,
_secrets_dir: Union[Path, str, None] = None,
**values: Any
):
...