Commit 31f1066

committed
adds format utils
1 parent f4ecb4c commit 31f1066

1 file changed

+31
-0
lines changed

recipes_source/distributed_checkpoint_recipe.rst

Lines changed: 31 additions & 0 deletions
@@ -320,6 +320,37 @@ the intent is to save or load in "non-distributed" style, meaning entirely in th
run_checkpoint_load_example()

Formats
----------
One drawback not yet mentioned is that DCP saves checkpoints in a format that is inherently different from the one generated by ``torch.save``.
This can be an issue when users wish to share models with users accustomed to the ``torch.save`` format, or more generally want to add format flexibility
to their applications. For these cases, we provide the ``format_utils`` module in ``torch.distributed.checkpoint.format_utils``.

A command line utility is provided for the user's convenience. It is invoked as follows:
``python -m torch.distributed.checkpoint.format_utils <mode> <checkpoint location> <location to write formats to>``, where ``<mode>`` is one of ``torch_to_dcp`` or ``dcp_to_torch``.

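The command line call can also be assembled from Python, for example in a conversion script. The following is a minimal sketch rather than part of the recipe: ``build_convert_command`` and ``VALID_MODES`` are hypothetical names introduced here, and the sketch assumes the utility takes its arguments positionally in the order mode, source, destination (confirm with ``--help`` on your installed version).

```python
import sys

# Conversion modes named in the text above.
VALID_MODES = ("torch_to_dcp", "dcp_to_torch")


def build_convert_command(mode, src, dst):
    """Return the argv list for the format_utils command line utility.

    Assumption: arguments are positional, in the order mode, source, destination.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return [
        sys.executable,  # the interpreter running this script
        "-m",
        "torch.distributed.checkpoint.format_utils",
        mode,
        str(src),
        str(dst),
    ]


cmd = build_convert_command("dcp_to_torch", "checkpoint", "torch_save_checkpoint.pth")
print(" ".join(cmd))

# To actually run the conversion (requires an existing DCP checkpoint):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the argument list separately from running it keeps the mode validation testable without a checkpoint on disk.
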
Alternatively, functions are provided for users who wish to convert checkpoints directly from Python.

.. code-block:: python

    import os

    import torch
    import torch.distributed.checkpoint as DCP
    from torch.distributed.checkpoint.format_utils import dcp_to_torch_save, torch_save_to_dcp

    CHECKPOINT_DIR = "checkpoint"
    TORCH_SAVE_CHECKPOINT_DIR = "torch_save_checkpoint.pth"

    # convert the DCP checkpoint to torch.save format (assumes the checkpoint was generated as above)
    dcp_to_torch_save(CHECKPOINT_DIR, TORCH_SAVE_CHECKPOINT_DIR)

    # convert the torch.save checkpoint back to DCP format
    torch_save_to_dcp(TORCH_SAVE_CHECKPOINT_DIR, f"{CHECKPOINT_DIR}_new")

Conclusion
----------
In conclusion, we have learned how to use DCP's :func:`save` and :func:`load` APIs, as well as how they differ from :func:`torch.save` and :func:`torch.load`.
