clean-up in config.py focusing on shared path #1579

sbAsma · 2026-01-12T09:10:52Z

Description

Issue Number

Closes #1061

Is this PR a draft? Mark it as draft.

Checklist before asking for review

I have performed a self-review of my code
My changes comply with basic sanity checks:
- I have fixed formatting issues with ./scripts/actions.sh lint
- I have run unit tests with ./scripts/actions.sh unit-test
- I have documented my code and I have updated the docstrings.
- I have added unit tests, if relevant
I have tried my changes with data and code:
- I have run the integration tests with ./scripts/actions.sh integration-test
- (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
- (bigger changes and experiments) I have shared a HedgeDoc in the GitHub issue with all the configurations and runs for these experiments
I have informed and aligned with people impacted by my change:
- for config changes: the MatterMost channels and/or a design doc
- for changes of dependencies: the MatterMost software development channel

grassesi

Great work. In a few places there is potential for even more simplification; please have a look at my suggestions.

grassesi · 2026-01-13T15:27:05Z

packages/common/src/weathergen/common/config.py

+# Cache the expensive private config loading operation
+_shared_wg_base_path = None
+
+
+def _get_shared_wg_base_path() -> Path:
+    """Get the shared working directory base path, cached after first call."""
+    global _shared_wg_base_path
+    if _shared_wg_base_path is None:
+        pcfg = _load_private_conf()
+        _shared_wg_base_path = Path(pcfg.get("path_shared_working_dir"))
+    return _shared_wg_base_path


Python provides a very simple mechanism for caching results: you can decorate a function with @cache (from functools):

@cache
def _get_shared_wg_base_path() -> Path:
    """Get the shared working directory base path, cached after first call."""
    private_config = _load_private_conf()
    return Path(private_config.get("path_shared_working_dir"))
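A minimal, self-contained sketch of the @cache suggestion in this comment. The `_load_private_conf` below is only a stand-in that counts its calls and returns a made-up path; the real loader in config.py is not reproduced here.

```python
from functools import cache
from pathlib import Path

_load_calls = 0  # counts how often the stand-in loader runs


def _load_private_conf() -> dict:
    """Stand-in for the real private-config loader (assumption, illustration only)."""
    global _load_calls
    _load_calls += 1
    return {"path_shared_working_dir": "/tmp/shared"}  # made-up value


@cache
def _get_shared_wg_base_path() -> Path:
    """Return the shared working directory; the body runs only on the first call."""
    return Path(_load_private_conf()["path_shared_working_dir"])


first = _get_shared_wg_base_path()
second = _get_shared_wg_base_path()
calls_after_two_lookups = _load_calls  # the loader has run once at this point

# Tests that swap the private config can reset the cache explicitly.
_get_shared_wg_base_path.cache_clear()
third = _get_shared_wg_base_path()
calls_after_clear = _load_calls
```

Compared with the module-level global in the diff, the decorator keeps the cached state out of the function body, and `cache_clear()` gives tests a way to force a reload.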

grassesi · 2026-01-13T15:36:53Z

packages/common/src/weathergen/common/config.py


-def set_paths(config: Config) -> Config:
-    """Set the configs run_path model_path attributes to default values if not present."""
-    config = config.copy()
-    config.run_path = _get_config_attribute(
-        config=config, attribute_name="run_path", fallback="results"
-    )
-    config.model_path = _get_config_attribute(
-        config=config, attribute_name="model_path", fallback="models"
-    )
-
-    return config
-
-
-def _get_config_attribute(config: Config, attribute_name: str, fallback: str) -> str:
-    """Get an attribute from a Config. If not available, fall back to path_shared_working_dir
-    concatenated with the desired fallback path. Raise an error if neither the attribute nor a
-    fallback is specified."""
-    attribute = OmegaConf.select(config, attribute_name)
-    fallback_root = OmegaConf.select(config, "path_shared_working_dir")
-    assert attribute is not None or fallback_root is not None, (
-        f"Must specify `{attribute_name}` in config if `path_shared_working_dir` is None in config"
-    )
-    attribute = attribute if attribute else fallback_root + fallback
-    return attribute
-
-


nice, I am glad that this confusing logic is gone.
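One detail that made the removed helper easy to misread is that it joined the fallback root and the fallback name with string `+`, so the result depended on whether the root ended with a separator. A small illustration with made-up values (not taken from any project config):

```python
from pathlib import Path

fallback_root = "/shared/wg"  # made-up root without a trailing slash
fallback = "results"

# Plain string concatenation, as in the removed `fallback_root + fallback`.
concatenated = fallback_root + fallback

# pathlib inserts the separator regardless of how the root is written.
joined = Path(fallback_root) / fallback
```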

grassesi · 2026-01-15T07:11:14Z

src/weathergen/utils/train_logger.py


-        result_dir_base = Path(cf.run_path)
+        result_dir_base = config._get_shared_wg_path("results")
        result_dir = result_dir_base / run_id


please use result_dir = config.get_path_run(cf) here

grassesi · 2026-01-15T08:24:34Z

src/weathergen/utils/train_logger.py

        metrics_path = get_train_metrics_path(
-            base_path=Path(self.cf.run_path), run_id=self.cf.run_id
+            base_path=config._get_shared_wg_path("results"), run_id=self.cf.run_id
        )


Please use config.get_path_run(self.cf) here.

grassesi · 2026-01-15T08:24:50Z

src/weathergen/utils/train_logger.py


-        result_dir_base = Path(cf.run_path)
+        result_dir_base = config._get_shared_wg_path("results")
        result_dir = result_dir_base / run_id


grassesi · 2026-01-15T08:37:48Z

packages/common/src/weathergen/common/config.py

    """
-    pcfg = _load_private_conf()
-    return Path(pcfg.get("path_shared_working_dir")) / local_path
+    return _get_shared_wg_base_path() / local_path


There is no need for this method. Everywhere outside of this module, config.get_path_...(cf) should be used. Inside this module, this method is just a more indirect way of writing _get_shared_wg_base_path() / local_path. Please remove this method and instead rename _get_shared_wg_base_path() to _get_shared_wg_path(). Then use that method to implement get_path_run and get_path_model.

grassesi · 2026-01-15T08:43:48Z

packages/common/src/weathergen/common/config.py

+            model_path = str(_get_shared_wg_path("models"))
        path = Path(model_path)


Use model_path = _get_shared_wg_path() / "models" here. model_path does not need to be a str, see my comments on get_shared_wg_path.

If you implement the suggestion of my comment on get_path_model you could even do model_path = get_path_model(run_id=run_id).

grassesi · 2026-01-15T09:01:46Z

packages/common/src/weathergen/common/config.py

 def get_path_model(config: Config) -> Path:
    """Get the current runs model_path for storing model checkpoints."""
-    return Path(config.model_path) / config.run_id
+    model_path = _get_shared_wg_path("models")
+    return model_path / config.run_id


It would be nice if this method accepted either a config object or a run_id directly, e.g.:

def get_path_model(config: Config | None = None, run_id: str | None = None) -> Path:
    if config or run_id:
        run_id = run_id if run_id else config.run_id
    else:
        msg = f"Missing run_id and cannot infer it from config: {config}"
        raise ValueError(msg)
    return _get_shared_wg_path() / "models" / run_id

Then we could use it in load_run_config / _get_model_config_file_read_name: get_path_model(run_id=run_id)
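A runnable variant of this idea, as a sketch only: `_get_shared_wg_path` is stubbed with a made-up directory and `SimpleNamespace` stands in for the real Config. It checks `is None` instead of truthiness, which is a design choice of this sketch rather than part of the suggestion.

```python
from pathlib import Path
from types import SimpleNamespace
from typing import Optional


def _get_shared_wg_path() -> Path:
    """Stand-in returning a made-up shared working directory."""
    return Path("/tmp/shared")


def get_path_model(config=None, run_id: Optional[str] = None) -> Path:
    """Resolve the model directory from an explicit run_id or, failing that, from config."""
    if run_id is None:
        if config is None:
            raise ValueError("Missing run_id and no config to infer it from")
        run_id = config.run_id
    return _get_shared_wg_path() / "models" / run_id


cf = SimpleNamespace(run_id="abc123")  # hypothetical config-like object
from_config = get_path_model(cf)
from_run_id = get_path_model(run_id="xyz789")
explicit_wins = get_path_model(cf, run_id="xyz789")  # an explicit run_id takes precedence

# Calling with neither argument is an error.
try:
    get_path_model()
    missing_raises = False
except ValueError:
    missing_raises = True
```

Testing against `None` means a config object that happens to evaluate as falsy is still consulted for its run_id.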

grassesi · 2026-01-15T09:10:10Z

packages/common/src/weathergen/common/config.py

-    dirname = path_models / config.run_id
+    dirname = get_path_model(config)
    dirname.mkdir(exist_ok=True, parents=True)

-    fname = _get_model_config_file_write_name(path_models, config.run_id, mini_epoch)
+    path_models_parent = dirname.parent
+    fname = _get_model_config_file_write_name(path_models_parent, config.run_id, mini_epoch)


Please adjust _get_model_config_file_write_name and _get_model_config_file_read_name to return only the filename, not the entire path of the config file, e.g.:

dirname = get_path_model(config)
dirname.mkdir(exist_ok=True, parents=True)
fname = _get_model_config_file_write_name(config.run_id, mini_epoch)
json_str = json.dumps(OmegaConf.to_container(_strip_interpolation(config)))
with (dirname / fname).open("w") as f:
    f.write(json_str)
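The separation requested here (the helper returns a bare filename, the caller joins it to the run's model directory) can be sketched in isolation as follows. The filename scheme, the `save_config` wrapper, and the plain dict used in place of an OmegaConf config are all assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path


def _get_model_config_file_write_name(run_id: str, mini_epoch: int) -> str:
    """Return only the file name; this naming scheme is made up for the example."""
    return f"model_{run_id}_mini_epoch{mini_epoch:05d}.json"


def save_config(model_dir: Path, run_id: str, mini_epoch: int, config: dict) -> Path:
    """Write config as JSON into model_dir and return the full path of the file."""
    model_dir.mkdir(exist_ok=True, parents=True)
    target = model_dir / _get_model_config_file_write_name(run_id, mini_epoch)
    with target.open("w") as f:
        f.write(json.dumps(config))
    return target


with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "models" / "abc123"
    written = save_config(model_dir, "abc123", 3, {"lr": 0.001})
    written_name = written.name
    parent_is_model_dir = written.parent == model_dir
    round_trip = json.loads(written.read_text())
```

Because the helper no longer needs a parent directory, the caller does not have to compute something like `dirname.parent` just to pass it back in.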

grassesi · 2026-01-15T09:11:48Z

packages/common/src/weathergen/common/config.py

-        base_config = load_run_config(
-            from_run_id, mini_epoch, private_config.get("model_path", None)
-        )
+        base_config = load_run_config(from_run_id, mini_epoch, _get_shared_wg_path("models"))


There is no need to pass _get_shared_wg_path() / "models" here, since load_run_config retrieves it again anyway.

- Simplify get_path_model/get_path_run to always resolve via _get_shared_wg_path()
- Change _get_shared_wg_path() to cached, argument-free helper returning the shared working dir from private config
- Adjust model config save/load to build filenames relative to the run’s model directory instead of passing parent paths around
- Update load_run_config and load_merge_configs to use new path helpers and improve assertion/log messages
- Replace internal _get_shared_wg_path("results") usages with get_path_run() in wegen_reader and train_logger

sbAsma added 3 commits January 12, 2026 05:43

caching get_shared_wg_path()

360acc9

renaming get_path_output to get_path_results

3c07653

model and results paths from get_shared_wg_path() and removed _get_co…

d4c690d

…nfig_attribute()

github-project-automation bot added this to WeatherGen-dev Jan 12, 2026

clessig requested a review from grassesi January 12, 2026 09:13

sbAsma added 6 commits January 13, 2026 04:18

marking get_shared_wg_path() as private

1eb724f

removing set_path()

79f41bc

fixed call to _get_shared_wg_path

0e60a16

fixed import, code clean-up, change caching decorator

b39fb86

changed way of caching _get_shared_wg_base_path

f4a37e9

fixed typing error

7e34a85

grassesi requested changes Jan 15, 2026

github-project-automation bot moved this to In Progress in WeatherGen-dev Jan 15, 2026
