Conversation

@DNXie DNXie commented Sep 24, 2025

Across all of our applications, everything is currently a Service, but some components should be plain actors. For example, in apps/grpo/main, trainer and replay_buffer should be actors.

Summary:

  • Actors: dataloader, trainer, replay_buffer, compute_advantages
  • Services: Policy, ref_model, reward_actor

Changes:

  • Updated apps/grpo, apps/toy_rl (with dcp off)
  • Updated Policy to take use_dcp from config.
  • Dropped apps/rl since it is deprecated.
  • Single-actor calls now use call_one instead of choose. The difference is that call_one asserts the callee is a singleton, i.e. the endpoint is backed by exactly one actor (see the sketch after this list).
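
To make the call_one vs. choose distinction concrete, here is a minimal Python sketch of the semantics. ActorMesh, ReplayBuffer, and the method names below are illustrative stand-ins, not the Monarch or forge API; the only point is that choose routes to an arbitrary replica while call_one requires exactly one actor behind the endpoint.

```python
import random


class ActorMesh:
    """Toy stand-in for a mesh of actor replicas (not the real API)."""

    def __init__(self, actors):
        self.actors = list(actors)

    def choose(self, method, *args, **kwargs):
        # Route the call to an arbitrary replica; fine for services with
        # many interchangeable replicas (e.g. the policy).
        actor = random.choice(self.actors)
        return getattr(actor, method)(*args, **kwargs)

    def call_one(self, method, *args, **kwargs):
        # Require that exactly one actor backs this mesh, then call it.
        # This is the guarantee we want for singletons like the trainer
        # and the replay buffer.
        assert len(self.actors) == 1, "call_one requires a singleton actor"
        return getattr(self.actors[0], method)(*args, **kwargs)


class ReplayBuffer:
    """Toy singleton actor."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)
        return len(self._items)


buffer = ActorMesh([ReplayBuffer()])
print(buffer.call_one("add", {"episode": 0}))  # ok: exactly one actor behind the mesh
```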

Test

python -m apps.grpo.main --config apps/grpo/qwen3_8b.yaml
python -m apps.grpo.main --config apps/grpo/qwen3_1_7b.yaml
python -m apps.toy_rl.sumdigits --config apps/toy_rl/sumdigits.yaml

I didn't run

python -m apps.grpo.main --config apps/grpo/qwen3_multinode.yaml

since the config is outdated. Looks like this one is already deprecated.

cc @Ritesh1905

@meta-cla meta-cla bot added the CLA Signed label Sep 24, 2025
@DNXie DNXie changed the title from "[WIP] Change services to actors except Policy" to "Change services to actors except Policy and drop apps/rl" Sep 25, 2025
@DNXie DNXie requested a review from allenwang28 September 25, 2025 19:33
@DNXie DNXie requested a review from Ritesh1905 September 25, 2025 19:52
@casteryh
Contributor

Would love some clarification on service vs policy.
Especially why we have this distinction at all.
cc @allenwang28

@allenwang28
Contributor

> Would love some clarification on service vs policy. Especially why we have this distinction at all.

I am going to assume you mean service vs actor! It's a good question. For context, the main reason we have services in the first place is exactly to handle load balancing and fault tolerance. These are things that Monarch doesn't give you out of the box, but it does give you the capabilities to implement them. We certainly want this for something like vLLM, or whenever we want the ability to spin up and load balance across multiple execution environments in-band.

We initially had everything just be services for simplicity, but in reality not everything needs to be a service. For the replay buffer, trainer, etc., you don't need the routing capabilities. Additionally, the fault tolerance story for those is not as well defined as that of the policy/environments/reference model.

Since the world will also learn about Monarch, I think the layering is clearer this way: Actors are the base capabilities from Monarch, and Service is a distinct abstraction built on top of them.
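
As a rough mental model of that layering, here is a minimal sketch; Service, PolicyActor, and route below are illustrative names, not the forge implementation. Plain actors do the work, and a service layer adds replica routing and naive fault handling on top of them.

```python
import itertools


class PolicyActor:
    """Toy stand-in for a single policy replica (e.g. one vLLM engine)."""

    def __init__(self, replica_id):
        self.replica_id = replica_id

    def generate(self, prompt):
        return f"[replica {self.replica_id}] completion for: {prompt!r}"


class Service:
    """Toy service layer: round-robin load balancing plus naive fault handling."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self._next = itertools.cycle(range(len(self.replicas)))

    def route(self, method, *args, **kwargs):
        # Try each replica at most once, starting from the next in rotation.
        for _ in range(len(self.replicas)):
            replica = self.replicas[next(self._next)]
            try:
                return getattr(replica, method)(*args, **kwargs)
            except RuntimeError:
                continue  # skip a failed replica and try the next one
        raise RuntimeError("all replicas failed")


policy = Service([PolicyActor(i) for i in range(2)])
print(policy.route("generate", "2 + 2 ="))
print(policy.route("generate", "the capital of France is"))
```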

procs: 1
num_replicas: 1
with_gpus: false
ref_model:
Contributor

actually I was wrong - I think we only have trainer and replay buffer be actors, the rest are ok to keep as services

Member Author

It makes sense. Why are dataset, compute_advantages, and reward_actor also services? Do they need replicas?

Member Author

If num_replicas=1, what's the difference between an actor and a service?

Member Author

Done!

Contributor

sorry Danning, Dataset is an actor, compute_advantages is an actor, reward_actor is a service

@DNXie DNXie requested a review from allenwang28 September 26, 2025 04:55
@allenwang28 allenwang28 left a comment

lgtm let's just use call_one() wherever appropriate

import torch
import torch.nn.functional as F
import torchstore as ts
from forge.actors._torchstore_utils import get_param_key
Member Author

Fixed the typo here. CC @casteryh

Contributor

thanks!

@DNXie DNXie merged commit 510a523 into meta-pytorch:main Sep 29, 2025
5 checks passed