Conversation

@pbontrager (Contributor) commented Sep 10, 2025

  • Updates the trainer for the GRPO loss
  • Updates the replay buffer to do batching
  • Updates the RL app test to run one training step (this app is meant for testing)

The goal of this PR is to enable titan training while keeping the data definition and the loss easily accessible to the user. The loss, the data type, and the collation are left to the user so they can co-design all three for their specific data and are less likely to have to touch the trainer.
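For concreteness, here is a minimal sketch of that pattern; all names (Episode, collate, simple_grpo_loss) are hypothetical stand-ins, not the actual API:

    # Hypothetical sketch of the user-owned pieces; illustrative only.
    from dataclasses import dataclass

    import torch
    from torch import Tensor


    @dataclass
    class Episode:
        """User-defined data type for one sampled trajectory."""
        tokens: Tensor        # [seq_len] sampled token ids
        ref_logprobs: Tensor  # [seq_len] reference-policy log-probs
        advantages: Tensor    # [seq_len] per-token advantages


    def collate(
        batches: list[list[Episode]],
    ) -> tuple[list[dict[str, Tensor]], list[dict[str, Tensor]]]:
        """User-defined collation: one (inputs, targets) dict pair per inner list."""
        inputs, targets = [], []
        for batch in batches:
            tokens = torch.stack([e.tokens for e in batch])
            inputs.append({"tokens": tokens})
            targets.append(
                {
                    "tokens": tokens,
                    "ref_logprobs": torch.stack([e.ref_logprobs for e in batch]),
                    "advantages": torch.stack([e.advantages for e in batch]),
                }
            )
        return inputs, targets


    def simple_grpo_loss(logits: Tensor, targets: dict[str, Tensor]) -> Tensor:
        """User-defined loss, co-designed with Episode and collate above."""
        logprobs = torch.log_softmax(logits, dim=-1)
        logprobs = logprobs.gather(-1, targets["tokens"].unsqueeze(-1)).squeeze(-1)
        ratio = torch.exp(logprobs - targets["ref_logprobs"])
        # Simplified policy-gradient term; full GRPO adds clipping and a KL penalty.
        return -(ratio * targets["advantages"]).mean()

Since all three pieces live in user code, the trainer only has to move batches to the device and call the loss.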

@meta-cla bot added the CLA Signed label Sep 10, 2025
@pbontrager marked this pull request as ready for review September 12, 2025 23:07
@joecummings (Member) left a comment

Is the intention to integrate this into the GRPO app or leave that to me?

Also - how is the loss passed in?

@Jack-Khuu (Contributor) left a comment

LGTM, accepting to unblock.

We can follow up with a GRPO PR


    def train_step(
        self, inputs: list[dict[Tensor]], targets: list[dict[Tensor]]
    ) -> None:

Contributor:

Update return type:

    return tensor
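As an aside, dict in the type hints takes both a key and a value type, so a plausible version of the fixed signature (a guess at the intent, not the actual diff) would be:

    def train_step(
        self, inputs: list[dict[str, Tensor]], targets: list[dict[str, Tensor]]
    ) -> Tensor:
        ...
        return tensor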


    def collate(batches: list[list[Episode]]):

Contributor:

return type?
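Judging from the train_step signature above, the annotation would plausibly be (again a guess, not confirmed by the diff):

    def collate(
        batches: list[list[Episode]],
    ) -> tuple[list[dict[str, Tensor]], list[dict[str, Tensor]]]:
        ...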

@joecummings (Member) left a comment

Delete the commented-out code, please.


print("Collecting Data...")
g = torch.manual_seed(0)
global_batch_size = cfg.replay_buffer.batch_size * cfg.replay_buffer.dp_size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to change right now, but wouldn't the global batch size technically be a trainer config and not a replay buffer config? Is it possible to tell omegaconf like "make this value the same as the one defined elsewhere"?
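For reference, OmegaConf does support this through variable interpolation; a minimal sketch, with hypothetical key names:

    from omegaconf import OmegaConf

    cfg = OmegaConf.create(
        {
            "trainer": {"global_batch_size": 64},
            "replay_buffer": {"batch_size": "${trainer.global_batch_size}"},
        }
    )
    assert cfg.replay_buffer.batch_size == 64  # interpolation resolves on access

In a YAML config the same reference is written as batch_size: ${trainer.global_batch_size}.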

@pbontrager (Contributor, Author) commented Sep 15, 2025

Not a trainer config anymore, since replay_buffer is its own service and the trainer doesn't handle data or loading anymore.

@LucasLLC (Contributor) left a comment

Merging to unblock, please follow up on nits

@LucasLLC merged commit cfd9677 into main Sep 15, 2025
5 checks passed