JenniferWang

Similar to the stand-alone vLLM app, this PR introduces a stand-alone trainer app to make investigating trainer OOMs faster. It is especially useful for a single-node trainer, because you can run it locally and system metrics are much easier to obtain.

Test

To reproduce the OOM, change the activation checkpointing config in apps/grpo/qwen3_32b.yaml to:

  activation_checkpoint:
    mode: selective
    selective_ac_option: op
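
For experimenting with the override without editing the YAML by hand, here is a minimal sketch of merging it into an already-parsed config. Everything in it is an assumption for illustration: the `apply_override` helper is hypothetical (not part of this repo), the `trainer` parent key is a guess at where `activation_checkpoint` is nested, and the base values are placeholders rather than the contents of the real file.

```python
import copy


def apply_override(config: dict, override: dict) -> dict:
    """Return a copy of `config` with `override` merged in recursively.

    Hypothetical helper for illustration; not part of the repo.
    """
    merged = copy.deepcopy(config)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge key by key instead of replacing.
            merged[key] = apply_override(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Placeholder stand-in for the parsed YAML (assumed layout and values).
base_config = {
    "trainer": {
        "activation_checkpoint": {"mode": "full"},
        "training": {"seq_len": 2048},
    }
}

# The activation checkpointing settings from this PR's test plan.
oom_repro_override = {
    "trainer": {
        "activation_checkpoint": {
            "mode": "selective",
            "selective_ac_option": "op",
        }
    }
}

repro_config = apply_override(base_config, oom_repro_override)
print(repro_config["trainer"]["activation_checkpoint"])
```

The merge returns a new dict and leaves `base_config` untouched, so the original and repro settings can be compared side by side.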

The repro is blazingly fast :)

https://meta.wandb.io/jiyue/grpo-training/runs/8qe73q1b?nw=nwuserjiyue

@meta-cla meta-cla bot added the CLA Signed label Oct 4, 2025
@JenniferWang JenniferWang marked this pull request as ready for review October 4, 2025 02:47
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

# Usage: python -m apps.trainer.main --config apps/grpo/qwen3_32b.yaml

nit: can we rename this to rl_trainer? We already have sft and sft_v2, so a separate trainer app may confuse things even further lol

@allenwang28 allenwang28 left a comment

just file name change, but looks good to me!

In the near future we should consider another location for this and everything else that isn't GRPO; these should probably form the basis of many of our integration tests.

@JenniferWang JenniferWang merged commit 61c9775 into main Oct 6, 2025
6 checks passed