
DAOS-16362 pydaos: ensure checkpoint path is created #17489

Open
0xE0F wants to merge 3 commits into master from 0xe0f/pytorch-ensure-checkpoint-path
Conversation

@0xE0F 0xE0F commented Feb 2, 2026

Some uses of Checkpoint assume that the path to the checkpoint file will be created along with all missing parent directories.

For instance, the DLIO benchmark writes checkpoints as `/prefix/global_epochX_stepY/layer-Z.pt`.

This commit adds an `ensure_path` parameter that calls `mkdirall` before writing the checkpoint file.
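
As a rough illustration of the intended behavior, here is a minimal sketch on the local filesystem. It is not the pydaos implementation: `write_checkpoint` is a hypothetical helper, and `os.makedirs` stands in for `mkdirall` under the assumption that the two behave analogously. Only the `ensure_path` name and the DLIO-style path layout come from this PR.

```python
import os
import tempfile


def write_checkpoint(path: str, data: bytes, ensure_path: bool = False) -> None:
    """Hypothetical stand-in for a checkpoint writer (not the pydaos API)."""
    parent = os.path.dirname(path)
    if ensure_path and parent:
        # Local-filesystem analogue of mkdirall: create every missing parent.
        os.makedirs(parent, exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)


def demo() -> tuple:
    """Write a DLIO-style nested checkpoint with and without ensure_path."""
    with tempfile.TemporaryDirectory() as prefix:
        path = os.path.join(prefix, "global_epoch1_step2", "layer-0.pt")
        try:
            # The parent directory does not exist yet, so this write fails.
            write_checkpoint(path, b"weights")
            failed_without = False
        except FileNotFoundError:
            failed_without = True
        # With ensure_path=True the parent directory is created first.
        write_checkpoint(path, b"weights", ensure_path=True)
        return failed_without, os.path.isfile(path)
```

In this sketch the first write fails because `global_epoch1_step2` is missing, while the second succeeds once `ensure_path=True` creates it. Whether the real parameter defaults to on or off is not stated here, so the `False` default above is an assumption.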

Features: pytorch

@0xE0F 0xE0F marked this pull request as ready for review February 2, 2026 22:15
@0xE0F 0xE0F requested review from a team as code owners February 2, 2026 22:15

github-actions bot commented Feb 2, 2026

Ticket title is 'pytorch checkpoint module'
Status is 'Resolved'
https://daosio.atlassian.net/browse/DAOS-16362

@0xE0F 0xE0F force-pushed the 0xe0f/pytorch-ensure-checkpoint-path branch from b356ea2 to bd6677d Compare February 4, 2026 04:03

Denis Barakhtanov added 3 commits February 6, 2026 08:18
Some uses of Checkpoint assume that the path to the checkpoint file will be
created along with all missing parent directories.

For instance, the DLIO benchmark writes checkpoints as `/prefix/global_epochX_stepY/layer-Z.pt`.

This commit adds an `ensure_path` parameter that calls `mkdirall` before
writing the checkpoint file.

Features: pytorch
Signed-off-by: Denis Barakhtanov <dbarahtanov@enakta.com>
Features: pytorch
Signed-off-by: Denis Barakhtanov <dbarahtanov@enakta.com>
Signed-off-by: Denis Barakhtanov <dbarahtanov@enakta.com>
@0xE0F 0xE0F force-pushed the 0xe0f/pytorch-ensure-checkpoint-path branch from eb23a4c to 951b03b Compare February 5, 2026 21:18

4 participants