
[WIP][Full Dtensor] Work with full dtensor fully_shard #2616

Draft
fegin wants to merge 2 commits into gh/fegin/100/base from gh/fegin/100/head

Conversation


@fegin fegin commented Mar 17, 2026

Stack from ghstack (oldest at bottom):

NOT READY YET

This PR is still a WIP, so code quality and design are not finalized.

We publish this PR to get early CI signals and to verify pytorch/pytorch#176334.

[ghstack-poisoned]
fegin added a commit that referenced this pull request Mar 17, 2026

ghstack-source-id: fc1597e
Pull-Request: #2616
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Mar 17, 2026
@fegin fegin marked this pull request as draft March 17, 2026 21:35
[ghstack-poisoned]
fegin added a commit that referenced this pull request Mar 17, 2026
ghstack-source-id: 184da4a
Pull-Request: #2616
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Mar 19, 2026
…mesh (#176334)

**Summary**
Enable `fully_shard` to operate on models whose parameters are already DTensors distributed across a full SPMD mesh, including the data-parallel dimensions.

Previously, `fully_shard` only handled the case where the DP dimensions were not part of an existing DTensor mesh (which held only TP or EP dimensions). This PR introduces `DataParallelMeshDims`, which lets users specify which mesh dimensions correspond to data parallelism (sharding and/or replication). The DP dimensions must carry `Replicate` placements in the original DTensor spec before `fully_shard` replaces them with `Shard`.

Note that the inputs/activations will also be DTensors distributed across the full SPMD mesh.
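The placement rule described above can be illustrated with a minimal, self-contained sketch. This is *not* the actual `DataParallelMeshDims` or `fully_shard` implementation; it is a hypothetical model that uses plain strings to stand in for DTensor placements, showing only the invariant the summary states: DP mesh dimensions must start out as `Replicate`, and `fully_shard` swaps them to `Shard`.

```python
# Hypothetical sketch of the placement rule: DP mesh dimensions must
# carry Replicate placements, which fully_shard then replaces with
# Shard. Plain strings stand in for real DTensor placement objects.

def apply_fully_shard(placements, dp_dims, shard_dim=0):
    """Replace Replicate with Shard(shard_dim) on the DP mesh dims.

    placements: one placement name per mesh dimension, e.g.
                ["Replicate", "Shard(0)"] for a 2-D (DP, TP) mesh.
    dp_dims:    indices of the data-parallel mesh dimensions.
    """
    new_placements = list(placements)
    for d in dp_dims:
        # The invariant from the PR summary: a DP dim must be
        # Replicate in the original spec, otherwise it is an error.
        if new_placements[d] != "Replicate":
            raise ValueError(
                f"mesh dim {d} must be Replicate before fully_shard, "
                f"got {new_placements[d]}"
            )
        new_placements[d] = f"Shard({shard_dim})"
    return new_placements

# 2-D mesh: dim 0 is DP (Replicate), dim 1 is TP (Shard(0)).
print(apply_fully_shard(["Replicate", "Shard(0)"], dp_dims=[0]))
# -> ['Shard(0)', 'Shard(0)']
```

After the swap, every mesh dimension carries a concrete sharding, which matches the summary's point that parameters (and inputs/activations) end up as DTensors distributed across the full SPMD mesh.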

**Verification**
We verify this PR through unit tests and TorchTitan integration: pytorch/torchtitan#2616.
Pull Request resolved: #176334
Approved by: https://github.com/weifengpy

Labels

ciflow/8gpu, CLA Signed
