
Commit eb7a303

wanchaol authored and pytorchmergebot committed
[dtensor] expose the __create_chunk_list__ in the doc (pytorch#144100)

As titled, this PR exposes the dunder method as a public API in the docs, so that different checkpoint implementations can leverage this protocol instead of exposing a separate API.

Pull Request resolved: pytorch#144100
Approved by: https://github.com/awgu
ghstack dependencies: pytorch#144099
1 parent 45411d1 commit eb7a303

File tree

2 files changed: 11 additions, 0 deletions


docs/source/distributed.tensor.rst

Lines changed: 1 addition & 0 deletions

@@ -51,6 +51,7 @@ on all devices, etc.
 .. autoclass:: DTensor
     :members: from_local, to_local, full_tensor, redistribute, device_mesh, placements
     :member-order: groupwise
+    :special-members: __create_chunk_list__


 DeviceMesh as the distributed communicator

torch/distributed/tensor/_api.py

Lines changed: 10 additions & 0 deletions

@@ -606,6 +606,16 @@ def __create_write_items__(self, fqn: str, object: Any):
         raise RuntimeError("Unsupported tensor type!")

     def __create_chunk_list__(self):
+        """
+        Return a list of ChunkStorageMetadata, a dataclass that describes the size/offset
+        of the local shard/replica on the current rank. For DTensor, each rank has a single
+        local shard/replica, so the returned list usually has only one element.
+
+        This dunder method is primarily used for distributed checkpointing.
+
+        Returns:
+            A List[:class:`ChunkStorageMetadata`] object that represents the shard
+            size/offset on the current rank.
+        """
         from torch.distributed.checkpoint.planner_helpers import (
             _create_chunk_from_dtensor,
         )
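To illustrate how a checkpoint implementation could consume this protocol, here is a minimal, self-contained sketch. The `ChunkStorageMetadata` stand-in dataclass, the `ShardedTensorStub` class, and the `collect_chunks` helper below are all hypothetical illustrations (the real `ChunkStorageMetadata` lives in `torch.distributed.checkpoint.metadata`); only the `__create_chunk_list__` method name comes from the PR.

```python
# Hypothetical sketch: a duck-typed checkpoint helper that consumes any
# object implementing the __create_chunk_list__ protocol described above.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ChunkStorageMetadata:
    # Stand-in for torch.distributed.checkpoint.metadata.ChunkStorageMetadata
    offsets: Tuple[int, ...]  # per-dim offset of the local shard in the global tensor
    sizes: Tuple[int, ...]    # per-dim size of the local shard


class ShardedTensorStub:
    """A minimal tensor-like object that participates in the protocol."""

    def __init__(self, offsets: Tuple[int, ...], sizes: Tuple[int, ...]):
        self._offsets = offsets
        self._sizes = sizes

    def __create_chunk_list__(self) -> List[ChunkStorageMetadata]:
        # Like DTensor, this stub owns a single local shard per rank,
        # so the returned list has exactly one element.
        return [ChunkStorageMetadata(offsets=self._offsets, sizes=self._sizes)]


def collect_chunks(obj) -> List[ChunkStorageMetadata]:
    """Duck-typed: any object exposing __create_chunk_list__ can be planned."""
    if hasattr(obj, "__create_chunk_list__"):
        return obj.__create_chunk_list__()
    raise TypeError(f"{type(obj).__name__} does not implement __create_chunk_list__")


# E.g. rank 1 of a 2-rank mesh holding the second row-block of an 8x8 tensor.
chunks = collect_chunks(ShardedTensorStub(offsets=(4, 0), sizes=(4, 8)))
print(chunks)
```

Because the checkpoint planner only needs the shard geometry, dispatching on the dunder method keeps it decoupled from any particular tensor type, which is the point of exposing the protocol rather than a separate API.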
