
Conversation

@casteryh
Contributor

@casteryh casteryh commented Sep 19, 2025

This fixes #190.

@meta-cla meta-cla bot added the CLA Signed label (managed by the Meta Open Source bot) on Sep 19, 2025
  hf_state_dict = self.engine.checkpointer.sd_adapter.to_hf(flattened_state_dict)
  # TODO: Figure out how to gracefully handle which model-to-vLLM conversion is needed
- vllm_ready_hf_sd = _qwen3_hf_to_vllm(sd=hf_state_dict, num_layers=28)
+ vllm_ready_hf_sd = _qwen3_hf_to_vllm(sd=hf_state_dict)
Contributor

Could this just be simplified by using num_layers=self.model.config.num_hidden_layers? See this.

Ideally you should not need this method in the trainer at all; the trainer should be agnostic to the type/architecture of the generator.

Contributor

This should work with torchtitan:

num_layers=self.engine.model_args.n_layers

Contributor Author

I guess simply reading the maximum layer index via the regex has the advantage that it is agnostic to the trainer implementation, as long as the state dict is in Hugging Face format.
Let me know what you think. @Ritesh1905 @allenwang28
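
(Editor's note: a minimal sketch of the regex-based inference described above. The helper name infer_num_layers and the "model.layers.<i>." key pattern are assumptions about the Hugging Face state-dict layout for Qwen3-style checkpoints, not the code from this PR.)

```python
import re

def infer_num_layers(hf_state_dict):
    """Infer the transformer layer count from HF-style state-dict keys.

    Assumes keys of the form "model.layers.<i>.<...>" (0-based layer
    indices), which is the usual layout for HF-format checkpoints.
    """
    layer_pattern = re.compile(r"model\.layers\.(\d+)\.")
    indices = [
        int(m.group(1))
        for key in hf_state_dict
        if (m := layer_pattern.match(key))
    ]
    if not indices:
        raise ValueError("no 'model.layers.<i>.' keys found in state dict")
    return max(indices) + 1  # convert 0-based max index to a count
```

With something along these lines, the HF-to-vLLM conversion can derive the layer count from the state dict itself rather than taking it as an argument, which is what keeps it independent of the trainer.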

@casteryh casteryh closed this Sep 23, 2025
@casteryh
Contributor Author

Closed since the change is already included in #215.
