Add weight tying support for Llama3 #2580
Open
dean-mccoppin wants to merge 2 commits into pytorch:main from
Conversation
Ties tok_embeddings.weight to output.weight via enable_weight_tying config flag. Follows the same pattern as Qwen3 (pytorch#1590). Closes pytorch#1524.
tianyu-l (Contributor) requested changes · Mar 15, 2026
IIUC the existing model registry doesn't have the Llama 3.2 1B / 3B models, which are the only variants with weight tying enabled. Please add those models to llama3/__init__.py. You can refer to the exact configs in the earlier attempt #1376.
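A registry entry along these lines would address the review comment. This is a hedged sketch, not torchtitan's actual `TransformerModelArgs` API: the `ModelArgs` dataclass and field names below are stand-ins, and the dimensions are taken from the publicly documented Llama 3.2 1B/3B architectures (verify against #1376 before using).

```python
# Sketch only: ModelArgs is a stand-in for the real model-args class in
# llama3/__init__.py; field names and defaults are assumptions.
from dataclasses import dataclass


@dataclass
class ModelArgs:
    dim: int
    n_layers: int
    n_heads: int
    n_kv_heads: int
    rope_theta: float = 500000.0
    enable_weight_tying: bool = False  # the new flag this PR adds


# Llama 3.2 1B / 3B are the only variants that tie embeddings to output.
llama3_configs = {
    "3.2-1B": ModelArgs(dim=2048, n_layers=16, n_heads=32, n_kv_heads=8,
                        enable_weight_tying=True),
    "3.2-3B": ModelArgs(dim=3072, n_layers=28, n_heads=24, n_kv_heads=8,
                        enable_weight_tying=True),
}
```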
```python
        HAS_TORCHTITAN_MODELS = True
    except Exception:
        HAS_TORCHTITAN_MODELS = False
```
dean-mccoppin (Contributor, Author):
My reasoning was that torchtitan/models/common/__init__.py re-exports from moe/, which imports triton at module level, so the import chain fails when triton isn't installed. But I just noticed that many existing unit tests import the same torchtitan.models.common.* submodules with no guard and pass in CI, so I'll be removing this.
Llama 3.2 1B and 3B are the only Llama variants with weight tying, so they belong in the registry. Without them the feature has no real entry point. Also dropped the try/except guard in test_weight_tying.py, which was inconsistent with every other unit test here and silently skips on broken imports.
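The hazard with that guard can be shown generically: a try/except around an import cannot tell a genuinely missing dependency apart from a broken one, so a real regression in the import chain would just flip the flag and silently skip the tests. A minimal illustration (module names here are stand-ins, not the actual test code):

```python
# Why try/except import guards are risky in tests: a broken dependency
# and a missing module are indistinguishable, so tests skip instead of fail.
import importlib


def has_module(name: str) -> bool:
    """Mimics the removed HAS_TORCHTITAN_MODELS guard."""
    try:
        importlib.import_module(name)
        return True
    except Exception:  # swallows ImportError *and* any real bug in the module
        return False


# Both outcomes look like "feature unavailable" to the test runner:
print(has_module("json"))                          # stdlib module, importable
print(has_module("definitely_not_installed_xyz"))  # missing, silently False
```

An unconditional import, as in the other unit tests, turns either failure mode into a loud test error instead.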
Implements enable_weight_tying for Llama3, sharing tok_embeddings.weight with output.weight. It mirrors the Qwen3 implementation from #1590 (thanks!)
Changes cover model.py (config field, tying in init/init_weights, PP guard), parallelize.py (grouped FSDP unit for tied params), state_dict_adapter.py (skip/reconstruct output.weight for HF conversion), and a new unit test file.
Closes #1524
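The core tying step the description refers to can be sketched framework-free. In the PR itself this happens with torch parameters inside model.py, guarded for pipeline parallelism (where `output` may live on a different stage and be `None`); the class names and shapes below are purely illustrative:

```python
# Framework-free sketch of weight tying: output.weight and
# tok_embeddings.weight become the *same* object, so any update to one
# is visible through the other. Class names are illustrative stand-ins.
class Embedding:
    def __init__(self, vocab: int, dim: int):
        self.weight = [[0.0] * dim for _ in range(vocab)]


class Linear:
    def __init__(self, dim: int, vocab: int):
        self.weight = [[0.0] * dim for _ in range(vocab)]


class Model:
    def __init__(self, vocab: int = 8, dim: int = 4,
                 enable_weight_tying: bool = True):
        self.tok_embeddings = Embedding(vocab, dim)
        # Under pipeline parallelism the last stage may be elsewhere,
        # in which case output is None and tying must be skipped.
        self.output = Linear(dim, vocab)
        if enable_weight_tying and self.output is not None:
            self.output.weight = self.tok_embeddings.weight  # shared storage


m = Model()
print(m.output.weight is m.tok_embeddings.weight)  # True: one tensor, two names
```

Because the tied pair shares storage, it also has to be treated as one unit by FSDP sharding and skipped/reconstructed during HF state-dict conversion, which is what the parallelize.py and state_dict_adapter.py changes handle.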