Flatten rl directory: remove vllm_compat, consolidate unified #2618

Merged
wwwjn merged 6 commits into main from rl-merge
Mar 18, 2026
Conversation

@wwwjn
Contributor

@wwwjn wwwjn commented Mar 17, 2026

rl/

  • Move all files from rl/unified/ directly under rl/ (actors, models, scripts, etc.)
  • Remove rl/vllm_compat/ entirely (unused by unified code)
  • Rename types.py -> rl_types.py to avoid shadowing Python stdlib types module
  • Fix vllm.model_executor.layers.attention.Attention import for newer vLLM
  • Update experiment registry: rl.unified -> rl
  • Update all internal imports and README paths
  • Add rl_grpo_qwen3_0_6b_tp1 config for TP=1 testing
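The types.py rename above guards against a classic Python pitfall. A minimal, hypothetical sketch (not code from this repo) showing how a directory-local types.py can shadow the stdlib module when its directory sits ahead of the standard library on sys.path — which happens, for example, when running a script from inside that directory:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway directory containing a shadowing types.py.
tmp = Path(tempfile.mkdtemp())
(tmp / "types.py").write_text("SHADOWED = True\n")

real = sys.modules.pop("types", None)  # simulate a fresh interpreter
sys.path.insert(0, str(tmp))           # local dir now wins the path search
try:
    shadowed = importlib.import_module("types")
    print(hasattr(shadowed, "SHADOWED"))  # the local file won, not the stdlib
finally:
    sys.path.remove(str(tmp))
    if real is not None:
        sys.modules["types"] = real       # restore the real stdlib module
```

Renaming the file (here, to rl_types.py) sidesteps the problem entirely, since no stdlib module shares that name.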

@meta-cla meta-cla bot added the CLA Signed label Mar 17, 2026
@felipemello1 felipemello1 left a comment
NOTE: TP=1 works, and TP=2 fails with RoPE cache + compile. Debugging before landing this PR.

thanks!

Approving to unblock. Since you are moving files around, my 2c is that we should have something like:

experiments/rl/
	|-- actors
	....
	|-- experiments
		|-- two_sum
		|-- gsm8k

And not have two_sum at the root level of '/rl'

@wwwjn
Contributor Author

wwwjn commented Mar 17, 2026


Good suggestion, let me move things around

Contributor

@tianyu-l tianyu-l left a comment


one comment

@wwwjn
Contributor Author

wwwjn commented Mar 18, 2026

Since you are moving files around, my 2c is that we should have something like:

experiments/rl/
	|-- actors
	....
	|-- experiments
		|-- two_sum
		|-- gsm8k

I updated it to be

	|-- tasks
		|-- two_sum
		|-- gsm8k

As the whole rl folder is already under the experiments/ folder, repeating the name is confusing. Wdyt @felipemello1 @tianyu-l

@felipemello1

felipemello1 commented Mar 18, 2026

I think that 'tasks', 'projects' or 'recipes' would be fine. My only "con" against 'tasks' is that a model can be trained on multiple tasks, e.g. coding, web search, etc. I feel like 'project' or 'recipe' would be more descriptive. But it shouldn't be a big deal either way. You could ask in the rl group if someone feels strongly about it. Your call!

Contributor

@daniellepintz daniellepintz left a comment

IMO I really don't think we need the extra tasks/sum_digits directories; I think it's simpler to just have a top-level simple_grpo.py file. The path gets very long, which is not the best user experience, and this is the controller file, so it's pretty important and I would prefer it's not so nested.

@wwwjn
Copy link
Contributor Author

wwwjn commented Mar 18, 2026

this is the controller file

My thought is that the controller file is not generalized; it's closely tied to the sum digits task right now. We should explicitly express this limitation in our file names / file structure. Once we have enough knowledge to build an abstraction for a generalizable controller, we can move it outside of the sum_digits task.
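As a hypothetical illustration of that coupling (all names invented, not code from this repo): a reward function hard-coded to the sum-digits objective only makes sense for that one task, so a controller built around it can't be reused for gsm8k or coding tasks without rewriting, which argues for keeping it under a sum_digits/ directory for now:

```python
# Invented example: the reward logic below is inherently sum_digits-specific.
def sum_digits_reward(prompt: str, completion: str) -> float:
    """Reward 1.0 iff the completion equals the digit sum of the prompt."""
    target = sum(int(ch) for ch in prompt if ch.isdigit())
    try:
        return 1.0 if int(completion.strip()) == target else 0.0
    except ValueError:
        return 0.0  # non-numeric completion earns no reward

print(sum_digits_reward("1 2 3", "6"))  # 1.0
print(sum_digits_reward("1 2 3", "7"))  # 0.0
```

A task-agnostic controller would instead take a reward callable (or grader object) as a parameter; until that abstraction exists, the file layout is the honest signal of the coupling.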

@daniellepintz
Contributor

It is tied to sum digits, although not that closely imo; I still think it's fairly generalizable. If we want to keep sum digits in the name that's okay, but I would just prefer it's not super nested. But if no one else shares that opinion, that's also okay : )

@felipemello1

felipemello1 commented Mar 18, 2026

sum_digits is just one project. With time, we may have 'gsm8k', 'web_search', 'DPO', 'coding', etc. When that happens, we need a place to put them. Each of them would have its own 'grader.py', 'data.py', 'main.py'. This is what I have seen in all RL libraries as well. Some examples:

https://github.com/thinking-machines-lab/tinker-cookbook/tree/main/tinker_cookbook/recipes
https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples
https://github.com/verl-project/verl/tree/main/examples

@joecummings
Member

Right now, yes, the main controller doesn't have an amount of task-specific information in it that would be difficult to remove; it could be generalized at this point.

However, splitting this into recipes gives us (and users) a ton of flexibility to experiment with different RL optimizations and tasks. For this reason, atm I'm in favor of recipes (definitely not calling it tasks though).

My only caveat is making it clear that recipes should not encourage the proliferation of every possible RL technique under the sun; let's keep things focused and aligned with the intention of titan.

wwwjn added 6 commits March 18, 2026 12:34
@wwwjn wwwjn merged commit ea614ba into main Mar 18, 2026
15 of 23 checks passed
daniellepintz added a commit that referenced this pull request Mar 19, 2026