Description
Today I spoke to a user that had a very long and treacherous parameters.yml like this:
sensors:
  sensor1:
    name: "Sensor 1"
    type: "temperature"
    stderr: 0.1
  sensor2:
    name: "Sensor 2"
    type: "humidity"
    stderr: 0.1
  sensor3:
    name: "Sensor 3"
    type: "temperature"
    stderr: 0.1
  sensor4:
    name: "Sensor 4"
    type: "temperature"
    stderr: 0.2

And so forth. So, there are several problems:
- There's lots of repetition. The user mentioned that it would be ideal to be able to "inherit" in YAML, something like:
_sensor:
  type: "<unknown>"
  stderr: 0.1
sensors:
  sensor1: ${_sensor} # All defaults are taken
  sensor2: ${_sensor}
    stderr: 0.2 # Try to override default, but 💥 syntax error

- It's unclear how to validate this YAML. We did a quick proof of concept combining OmegaConf and Pydantic v2:
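import typing as t

from omegaconf import OmegaConf
from pydantic import BaseModel, Field

class Sensor(BaseModel):
    name: str
    # The YAML key is `type`; the alias maps it onto `sensor_type`
    sensor_type: str = Field("<unknown>", alias="type")
    stderr: t.Optional[float] = 0.1

class Config(BaseModel):
    sensors: t.Dict[str, Sensor]

config = OmegaConf.load("conf/base/parameters.yml")
# Pydantic v2: convert the DictConfig to plain containers, then use model_validate
c = Config.model_validate(OmegaConf.to_container(config, resolve=True))
print(c.sensors["sensor3"].stderr)

Which was cool, because the defaults were filled from the Sensor model.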
However, (2a) it's not clear how to keep the defaults in the YAML, which was desirable (although there's maybe a way to achieve that in Pydantic), (2b) it's not clear if this should be in parameters.yml or rather a custom sensors.yml, and most importantly, (2c) it's not clear how or where to perform such validation. There's no after_config_loaded hook.
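For (2a), one option is to leave the defaults in the YAML under `_sensor` and resolve them in Python before validating. Below is a minimal sketch, assuming the file has already been parsed into plain dicts (for example with `OmegaConf.to_container`). The `apply_defaults` helper and the `_sensor` key convention are hypothetical, not a Kedro or OmegaConf feature, and the merge is shallow only.

```python
from typing import Any, Dict


def apply_defaults(raw: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Merge the hypothetical `_sensor` defaults into every sensor entry.

    Keys set on a sensor win over the defaults. This is a shallow merge:
    nested dicts are replaced, not merged.
    """
    defaults = raw.get("_sensor", {})
    return {
        key: {**defaults, **(overrides or {})}
        for key, overrides in raw.get("sensors", {}).items()
    }


# Stand-in for the parsed parameters.yml (assumed structure)
raw = {
    "_sensor": {"type": "<unknown>", "stderr": 0.1},
    "sensors": {
        "sensor1": {"name": "Sensor 1", "type": "temperature"},
        "sensor2": {"name": "Sensor 2"},
        "sensor4": {"name": "Sensor 4", "type": "temperature", "stderr": 0.2},
    },
}

sensors = apply_defaults(raw)
print(sensors["sensor2"])
print(sensors["sensor4"]["stderr"])
```

The merged dicts could then be passed to a Pydantic model that declares no defaults of its own, so the YAML stays the single source of default values.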
I think the closest might be what kedro-mlflow does using after_context_created https://github.com/Galileo-Galilei/kedro-mlflow/blob/e88679938b1d4c7633c3f631f6b402ff11ab61fe/kedro_mlflow/framework/hooks/mlflow_hook.py#L78-L79 but then it's trying to inject the config in the KedroContext https://github.com/Galileo-Galilei/kedro-mlflow/blob/e88679938b1d4c7633c3f631f6b402ff11ab61fe/kedro_mlflow/framework/hooks/mlflow_hook.py#L129-L134, with all the problems discussed in #3214.
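To make the `after_context_created` idea concrete, here is a rough sketch of a hook that only validates the parameters and fails early, without trying to inject anything back into the context. It assumes `context.params` exposes the loaded parameters as a dict; `validate_sensors`, `ParamsValidationHooks`, and the validation rules are hypothetical, and the `ImportError` fallback exists only so the snippet runs without Kedro installed.

```python
from typing import Any, Dict

try:
    from kedro.framework.hooks import hook_impl
except ImportError:  # fallback so the sketch runs without Kedro installed
    def hook_impl(func):
        return func


def validate_sensors(params: Dict[str, Any]) -> None:
    """Raise ValueError on a malformed sensor entry (hypothetical rules)."""
    for key, sensor in params.get("sensors", {}).items():
        if not isinstance(sensor.get("name"), str):
            raise ValueError(f"{key}: 'name' must be a string")
        stderr = sensor.get("stderr", 0.1)
        is_number = isinstance(stderr, (int, float)) and not isinstance(stderr, bool)
        if not is_number or stderr < 0:
            raise ValueError(f"{key}: 'stderr' must be a non-negative number")


class ParamsValidationHooks:
    @hook_impl
    def after_context_created(self, context) -> None:
        # Assumption: context.params holds the loaded parameters as a dict
        validate_sensors(context.params)
```

Such a class would typically be registered through `HOOKS = (ParamsValidationHooks(),)` in the project's `settings.py`. Note that this only rejects bad configuration; it does not solve the problem of handing the validated object to the rest of the project, which is the part discussed in #3214.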
How can we better support this use case?
Paging @datajoely, @Galileo-Galilei